ABSTRACT

The Retrospective Conversion (RECON) Working Task
Force investigated the problems of converting retrospective catalog
records to machine-readable form. The major conclusions and
recommendations of the Task Force cover five areas: the level of
machine-readable records, conversion of other machine-readable data
bases, a machine-readable National Union Catalog (NUC), an
alternative strategy for RECON, and general recommendations. It is
concluded that: 1) the full MARC (machine-readable cataloging code)
format is needed for records that distribute
cataloging information, but it is feasible to use less complete
records for the NUC; 2) the projected costs of converting other
machine-readable data bases are comparable to or less than
the present per-record MARC/RECON cost, depending upon the method of
updating; 3) data bases should be ranked by size and completeness of
records; 4) standards should be established for reporting the form
and contents of data bases; 5) automation of the NUC would cut time
and costs and increase access points to the data. A centralized
agency is recommended for large-scale conversion of the
retrospective records of the Library of Congress and other libraries.
(JG) 
Foreword

The RECON (Retrospective Conversion) Pilot
Project was initiated in August 1969 to investigate
the practical problems of converting retrospective
catalog records to machine-readable form. At the
same time, the RECON Working Task Force began
its studies of several problem areas related to the
conversion of catalog records. The final report of
the pilot project has been issued separately; the
present publication describes the special studies.
Financial support for these efforts came from the
U.S. Office of Education and the Council on Library
Resources, Inc. The library community has
been greatly benefited by their generosity.

The rosters of the RECON Working Task Force
and the RECON Advisory Committee remained essentially
as they were for the RECON feasibility
study; the names appear on page v. Thanks are
due these persons and the institutions that allowed
them to participate. The Working Task Force
wishes also to acknowledge the contributions of
Barbara E. Markuson, a private consultant, who
made the survey of machine-readable data bases in
other libraries, and Paul E. Kebabian of the University
of Vermont, who described the problems
of integrating bibliographic records from various
sources. Special thanks are due Susan C. Biebel
of the Library of Congress for her invaluable support
in all stages of those studies.

The results of these studies shed new light on
several critical problems in library automation. It
seems imperative that responsible persons and
agencies study this report carefully and take steps
to develop a national plan for conversion of retrospective
catalog records that satisfies the needs
of a broad community of users.



John G. Lorenz, 

Deputy Librarian of Congress 
Chairman, RECON Advisory Committee
Chapter 1



Introduction 



Concurrently with the RECON Pilot Project, the
RECON Working Task Force undertook to consider
certain basic questions of retrospective conversion
that are of national scope.

First, is it feasible to define a level or subset of
the MARC format that would allow a library using
the lower level to be part of a future national
network?

Second, is it possible to use machine-readable
records from a variety of sources in a national
bibliographic store as a way to reduce the conversion
effort on the national level?

Third, what are the problems of producing a
National Union Catalog from machine-readable
records?

As these studies and the pilot project progressed,
it also became apparent that there were
many practical difficulties in carrying out a large-scale
conversion project. Therefore, it seemed essential
to investigate alternative strategies for
RECON that might yield broad benefits in a reasonably
short time span.

During the early phases of the pilot project, a 
task to study the problems involved in the distri- 
bution and use of name and subject cross-reference 
control records in machine-readable form had been 
outlined by the Working Task Force. This study 
was not initiated because of funding and timing 
constraints and was replaced by the study of alter- 
native strategies. 

The results of these studies are presented in the
following pages. While some of the findings and
recommendations are less optimistic than those of
the original RECON study, it is important to realize
that they still affirm the need for coordinated activity
in the conversion of retrospective catalog
records. Although it seems impossible to prevent
all duplication of effort, it is within the realm of
possibility to keep that duplication to a minimum
and to achieve a high degree of compatibility
among records converted in different places.
Major Conclusions and Recommendations

The following sections give the major conclusions
and recommendations of the four areas of
investigation undertaken by the RECON Working
Task Force.

Levels of Machine-Readable Records 

Levels of machine-readable catalog records are
distinguished by differences in 1) the bibliographic
completeness of a record, and 2) the extent to
which its contents are separately designated. The
findings of this study were:

1) The level of a record must be adequate for the
purposes it will serve.

2) In terms of national use, a machine-readable
record may function as a means of distributing
cataloging information and as a means of reporting
holdings to a national union catalog (NUC).

3) To satisfy the needs of diverse installations
and applications, records for general distribution
should be in the full MARC format.

4) Records that satisfy the NUC function are not
necessarily identical with those that satisfy the
distribution function.

5) It is feasible to define the characteristics of a
machine-readable NUC report at a lower level than
the full MARC format.

Conversion of Other Machine-Readable Data 
Bases 

Machine-readable bibliographic data bases do
exist that could be used to increase the volume of
the national store under the following conditions:

1) The per-record cost of converting these records
to the MARC format, comparing them with records
in the LC Official Catalog, and updating their content
to the point where they match those records
approaches the present per-record MARC/RECON
cost.

2) The cost of converting the same records if only
the access points were updated appears to be substantially
lower than present MARC/RECON costs.
The minimum cost of this method of data base
conversion is probably on the order of one-half of
present costs. Since these data could not be used in
this form by the Library of Congress, the question
of how this effort could be funded remains to be
resolved.

3) Should any such program be undertaken, the
high-potential data bases should be ranked by size
and completeness of content of records. However,
the character of the records would have to be evaluated
to determine whether the estimated per-record
conversion cost held true for any given data
base.

4) A standard should be established for reporting
the form of material, language, and the content
of machine-readable records in library data bases
to simplify the job of determining the utility of
another library's data base.

A National Union Catalog in Machine- 
Readable Form 

Automation of the National Union Catalog
using the register/index form would have the following
advantages:

1) The range of access points to the bibliographic
data would be extended to titles and series.

2) All types of indexes would be cumulated and
published on the same schedule.

3) The time required to produce cumulations
would be significantly reduced.

4) The cost of the automated system offering these
advantages for monthly, quarterly, and annual
issues would not exceed the cost of the present
manual system. The cost of producing the quinquennial
would be sharply reduced.

5) The cost of the automated system would gradually
be reduced as more languages are covered by
the MARC Distribution Service. Further cost reductions
may be possible as other libraries are able
to report their holdings in machine-readable form.

6) Converting NUC reports and master index
records for non-MARC records to machine-readable
form would create a data base that could
be searched by nonconventional access points (e.g.,
language, imprint date, geographic area).

7) The NUC data base might eventually form the
nucleus of an on-line network of regional bibliographic
centers.

An Alternative Strategy for RECON

There is no ideal strategy for large-scale conversion
of retrospective catalog records. The critical
questions of the languages to be covered, the dates
of the records, the forms of material, the extent of
bibliographic information, and the details of the
machine format yield widely different answers depending
on the type and size of the library involved.
Therefore:

1) A centralized agency or component of an agency
should be established expressly to undertake a
large-scale conversion activity. This effort should
not divert the Library of Congress from its present
objective of going forward as rapidly as possible
to convert all of its current catalog records
to machine-readable form. To the extent that retrospective
records are required for Library of
Congress purposes (e.g., Card Division mechanization;
special book catalogs), LC would convert
these records according to its present practices.

2) The central agency should have two major
functions:

a. It should undertake a program to convert
the retrospective LC records that are most in
demand. Initially, the criterion for selection
might be those records ordered from the LC
Card Division more than a specified number
of times.

b. It should be responsible for adapting machine-readable
records from libraries other
than LC. The scope of this cooperative approach
would be modified as each new language
is covered at LC.

In developing its program and carrying out these
tasks, the agency should draw on the experience
gained in the MARC and RECON activities at the
Library of Congress. Since users will be obtaining
current catalogs from the Library of Congress, it
is essential that the products of these two enterprises
be entirely compatible.

3) To ensure that the conversion of other libraries'
machine-readable data bases results in consistent
records, the following procedures are
recommended:

a. If a library converts, it should use the best
available LC record.

b. If at all possible, the full MARC format
should be used.

c. The centralized agency should undertake to
process records to bring them to the full MARC
format (if necessary) and to make the access
points compatible with the LC Official
Catalog.

General Recommendation 

The problem of conversion of retrospective records
to machine-readable form is of concern to all
types of libraries in all parts of the country.
Therefore, the National Commission for Libraries
and Information Science should review the present
report as well as the original RECON feasibility
study to determine the course of action that is in
the national interest. The Commission might also
explore the sources of funds to implement its
recommendations for a national program for the
conversion of retrospective catalog records.
Levels of Machine-Readable Records

This study reports the conclusions reached with
respect to the feasibility of determining a level or
subset of the established MARC content designators
that would still allow a library using it to be part
of a future national network.

Definition of "Level" 

During the initial RECON study the Working
Task Force, for discussion purposes, considered
levels of encoding detail of machine-readable catalog
records in relation to the conditions under
which conversion might occur. A level was distinguished
by differences in 1) the bibliographic
completeness of a record, and 2) the extent to
which its contents were separately designated.
With respect to the latter point, the RECON report
stated:

A machine format for recording of bibliographic data
and the identification of these data for machine manipulation
is composed of a basic structure (physical representation),
content designators (tags, delimiters, subfield
codes), and contents (data elements in fixed and
variable fields). Although the basic structure should remain
constant, the contents and their designation are
subject to variation. For example, a name entry could
be designated merely as a name instead of being distinguished
as a personal name or corporate name. When a
distinction is made, a personal name entry can be further
refined as a single surname, multiple surname, or
forename. Likewise, if a personal name entry contains
date of birth and/or death, relationship to the work (editor,
compiler, etc.), or title, these data elements can be
identified or can be treated as part of the name entry
without any unique identification. Thus individual data
elements can be identified at various levels of completeness.
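
The idea of carrying one name entry at several levels of content designation can be sketched in Python. This is only an illustration under stated assumptions: the dictionary keys (`kind`, `form`, `name`, `dates`, `relator`), the sample heading, and the function `flatten` are hypothetical and are not MARC tags, indicators, or subfield codes.

```python
# Hypothetical illustration: the same name entry at three levels of
# content designation. Key names are invented for this sketch; they
# are not actual MARC tags, indicators, or subfield codes.

# Lowest level: the entry is designated merely as a name.
undifferentiated = {"name": "Smith, John, 1900-1970, ed."}

# Middle level: a personal name is distinguished from a corporate name,
# but the data elements inside the entry are not separately identified.
typed = {"kind": "personal", "name": "Smith, John, 1900-1970, ed."}

# Highest level: the form of name and each data element are identified.
fully_designated = {
    "kind": "personal",
    "form": "single surname",
    "name": "Smith, John",
    "dates": "1900-1970",
    "relator": "ed.",
}


def flatten(entry):
    """Collapse an entry to the lowest level of designation.

    Only the already identified parts are joined back into one string;
    the reverse direction is not attempted in this sketch.
    """
    parts = [entry["name"]]
    for key in ("dates", "relator"):
        if key in entry:
            parts.append(entry[key])
    return {"name": ", ".join(parts)}
```

In this sketch a fully designated entry can be reduced mechanically to the undifferentiated form, which is one way to picture what "levels of completeness" means for a single data element.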

Appendix F of the RECON report tentatively
defined three levels:

Level 1 involves the encoding of bibliographic items according
to the practices followed at the Library of Congress
for currently cataloged items, i.e., the MARC II
format. A distinguishing feature of level 1 is the inclusion
of certain content designators and data elements which,
in some instances, can be specified only with the physical
item in hand.

Level 2 supplies the same degree of detail as in level
1 insofar as it can be ascertained through an already
supplied bibliographic record. . . .

Level 3 would be distinguished by the fact that only
part of the bibliographic data in the original catalog record
would be transcribed. In addition, content designators
might be restricted . . .

At the outset of the present study, however, it
was recognized that incomplete bibliographic description
is not acceptable in records for national
use. In addition, it seemed that the question of having
a level below level 2 really arose from a desire
to define a machine-readable record with a lesser
degree of content designation rather than one with
less bibliographic data. It was decided, therefore,
to concentrate the study effort on this task, and
the original formulation of level 3 was discarded.

On further consideration, it was realized also
that the distinguishing feature between levels
1 and 2 was not significant. Omission of data elements
that cannot be determined unless the book
is in hand may simplify an individual record but
does not simplify the content designators in the
format because these elements are often present in
other records. Thus, as far as content designation
is concerned, levels 1 and 2 (as originally defined)
were in fact the same.

Once this similarity became apparent, it was
recognized that the specification of levels really
depended on the functions of machine-readable
catalog records from the standpoint of national
use.

Functions and Levels 

On the basis of present knowledge, it seems that
machine-readable records will serve two primary
functions for national use. The first involves the
distribution of cataloging information in machine-readable
form for use by library networks,
library systems, and individual libraries; the second
involves the recording of bibliographic data
in a national union catalog to reflect the holdings
of libraries in the United States and Canada. In
this report, the first is called the distribution function;
the second is called the national union catalog
(NUC) function. Each of these functions can
be related to a distinct level of machine-readable
record.

The Distribution Function

The distribution function can best be satisfied 
by a detailed record in a communications format 
from which an individual library can extract the 
subset of data useful in its application. At the pres- 
ent stage of library automation, it is impossible 
to define rigorously all of the potential uses of 
machine-readable catalog records. Thus, there is 
no way to predict which data elements may not 
be needed or to rank them according to their value 
to a wide variety of users under different circum- 
stances. 
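
The extraction step described above can be pictured as a simple filter over a fully designated record. The sketch below is illustrative only: the record layout (a list of tag/value pairs), the sample values, and the function name `extract_subset` are assumptions made for this example and do not describe any library's actual system.

```python
def extract_subset(record, wanted_tags):
    """Return only the fields whose tag a local application uses."""
    return [(tag, value) for tag, value in record if tag in wanted_tags]


# A hypothetical distributed record, reduced to four fields.
full_record = [
    ("100", "Smith, John"),
    ("245", "An example title"),
    ("260", "New York, 1970"),
    ("650", "Library automation"),
]

# A hypothetical local system that needs only the first two fields.
local_view = extract_subset(full_record, {"100", "245"})
```

Selecting from a detailed record is a one-line operation in this sketch, whereas a field that was never supplied cannot be recovered by any filter.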

To confirm the wide variation in treatment of
the MARC format, an analysis was made of the use
of MARC content designators by eight library systems
and emerging networks. The data from this
analysis were synthesized for presentation in two
tables. Table 3.1 shows the acceptance of content
designators in terms of the absolute number of
libraries using them. It should be read as shown
by the following examples: 1) 26 of the 63 MARC
tags are used by all eight libraries; 2) 92 of the 126
indicators are used by three libraries. Table 3.2
shows the acceptance of content designators in relative
terms. Thus, if only three libraries were using
a particular tag and all used the associated subfield
codes, the acceptance of those subfield codes
was calculated as 100 percent. In both Tables 3.1
and 3.2, the columns on indicators and subfield
codes include responses only from those libraries
that were definitely using the tag with which a
given indicator or subfield code was associated.
The analysis excludes tags for which no immediate
implementation is planned by the MARC Distribution
Service.
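
The relative measure used for indicators and subfield codes can be written as a one-line calculation. This is a minimal sketch of the rule as the text states it; the function and parameter names are hypothetical.

```python
def acceptance_percent(libraries_using_item, libraries_using_tag):
    """Relative acceptance of an indicator or subfield code.

    The denominator is the number of libraries using the associated
    tag, not the total number of libraries surveyed.
    """
    if libraries_using_tag == 0:
        raise ValueError("no library uses the associated tag")
    return 100.0 * libraries_using_item / libraries_using_tag


# The example from the text: three libraries use a tag and all three
# use its subfield codes.
example = acceptance_percent(3, 3)
```

Because the denominator changes from tag to tag, a subfield code can reach 100 percent acceptance even when only a minority of the eight libraries use the tag at all.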

The major findings of this analysis may be sum- 
marized as follows : 

1) Of 19 fixed fields, 14 were used by at least half 
of the libraries and all were used by at least one 
library. 



Table 3.1. Use of MARC content designators by 8 library
systems or networks

Number of                    Number of items
libraries      Fixed       Tags    Indicators²   Subfield
               fields¹                           codes²

Total            19        63³       126           181
8                 -        26          -             1
7                 -         6          -            88
6                 -         3          2            45
5                 1         5          7            15
4                 6         3          9             9
3                 7         2         92            11
2                 4         4         16             9
1                 1         7          -             3
None              -         7          -             -

¹ Only 6 libraries supplied this information.

² This column includes responses only from those libraries that were
definitely using the tag with which a given indicator or subfield code
was associated.

³ Excludes tags for which no immediate implementation is planned.



Table 3.2. Percentage of acceptance of MARC content
designators by 8 library systems or networks

Percent of                   Number of items
libraries      Fixed       Tags    Indicators¹   Subfield
               fields                            codes¹

Total            19        63²       126           181
100               -        26          -            10
75 to 99          1         9          2           134
50 to 74         13         8         16            32
25 to 49          4         6        108             5
1 to 24           1         7          -             -
0                 -         7          -             -

¹ This column includes responses only from those libraries that were
definitely using the tag with which a given indicator or subfield code
was associated.

² Excludes tags for which no immediate implementation is planned.

2) Of 63 tags, 43 were used by at least half of the 
libraries and 26 were used by all of them. Seven 
tags were not used by any of the libraries studied, 
but these tags cover items that will appear in ma- 
chine records produced by the National Library 
of Medicine, the National Agricultural Library,
and the British National Bibliography. 

3) Of 126 indicators, only 18 were used by at least 
half of the libraries. The highest degree of accept- 
ance was the use of the same two indicators by six 
libraries. On the other hand, each indicator was 
used by at least two libraries. 
4) Of 181 subfield codes, 176 were used by at least
half of the libraries that were using the related
tags. Each subfield code was used by at least a
quarter of the libraries that could express a relevant
opinion.
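
The four findings can be checked against the figures of Table 3.2 by summing the bands at or above 50 percent. The sketch below transcribes the table as read from the report, taking blank cells as zero; the variable names are hypothetical.

```python
# Rows of Table 3.2: number of items in each acceptance band.
# Columns: fixed fields, tags, indicators, subfield codes.
# Blank cells in the printed table are taken as zero.
table_3_2 = {
    "100":      (0, 26, 0, 10),
    "75 to 99": (1, 9, 2, 134),
    "50 to 74": (13, 8, 16, 32),
    "25 to 49": (4, 6, 108, 5),
    "1 to 24":  (1, 7, 0, 0),
    "0":        (0, 7, 0, 0),
}
columns = ("fixed fields", "tags", "indicators", "subfield codes")

# Bands that correspond to "at least half of the libraries".
at_least_half_bands = ("100", "75 to 99", "50 to 74")

totals = {
    name: sum(row[i] for row in table_3_2.values())
    for i, name in enumerate(columns)
}
at_least_half = {
    name: sum(table_3_2[band][i] for band in at_least_half_bands)
    for i, name in enumerate(columns)
}
```

The column totals come to 19, 63, 126, and 181, and the "at least half" sums come to 14, 43, 18, and 176, which agree with the counts quoted in findings 1 through 4.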

The foregoing analysis confirmed the view that
a nationally distributed record should be as rich in
content designation as possible. Failure to provide
this detail would result in many libraries having
to enrich the record to satisfy local needs, a process
more costly than deleting items selectively. Therefore,
as of now, the present MARC format constitutes
the level required to satisfy the national distribution
function.

The National Union Catalog Function

As noted above, the NUC function relates to the
use of machine-readable records to build a national
union catalog. At first thought, it might appear
that this function overlaps the distribution function.
As far as Library of Congress cataloging is
concerned, this view is correct. It is valid also with
respect to cooperative cataloging entries issued by
the Library as part of the card service. However,
the two functions are quite distinct as far as regular
reports to NUC are concerned.

The essential difference between the two categories
of catalog records is that those issued as
LC cards have been completely checked against
the Library's authority files and edited for consistency,
whereas only the main and added entries
of NUC reports have been checked for compatibility.
The impact of this difference can be judged
from the fact that an attempt to distribute NUC
reports as proof slips several years ago was abandoned
because the response to this service did not
justify its continuance.

Distributing NUC reports in machine-readable
form would add another dimension to the problem
of processing them, because, to be flexible
enough for wide acceptance, NUC reports would
have to be entirely compatible with those issued
by the MARC Distribution Service. Since compatibility
would involve more detailed content designation
than many libraries might put into their
records for local use, libraries would have to be
willing to provide this detail in NUC reports, or
the level of NUC reports would have to be upgraded
centrally. As the certification of the bibliographic
data and the content designators would entail a
major workload for the Library of Congress, it
does not seem practical to pursue this goal at
present.



It is possible, however, to define a subset of content
designators to cover the eventuality that outside
libraries may be able to report their holdings
to NUC in machine-readable form. A MARC subset
can be determined for the NUC function because
this function involves processing records in a
multiplicity of places to be used centrally for specifically
definable purposes. The distribution function,
on the other hand, involves the preparation
of records at a central source to be used for a wide
variety of purposes in a multiplicity of places. The
difference is vital when it comes to stating the requirements
for the two types of records.

The specifications of a machine-readable record
to fulfill the NUC function depend on the nature
and functions of the national union catalog itself.
The content designators for such a record were
defined in a separate investigation which is described
in Chapter 5. The present study was considered
to be completed once the feasibility of
defining a level of machine-readable record for
that purpose was established.

Conclusions 

The findings of this study of the feasibility of
defining levels of machine-readable bibliographic
records are as follows:

1) The level of a record must be adequate for the
purposes it will serve.

2) In terms of national use, a machine-readable
record may function as a means of distributing
cataloging information and as a means of reporting
holdings to a national union catalog.

3) To satisfy the needs of diverse installations
and applications, records for general distribution
should be in the full MARC format.

4) Records that satisfy the NUC function are not
necessarily identical with those that satisfy the
distribution function.

5) It is feasible to define the characteristics of a
machine-readable NUC report at a lower level than
the full MARC format.
Chapter 4



Conversion of Other Machine-Readable Data Bases



Introduction 

A large pool of machine-readable records has
accumulated as a result of automation projects in
various libraries. There is widespread opinion that
these records could be used to build a national bibliographic
data base. The potential benefits are
thought to be avoidance of duplication of input,
more rapid creation of a large store, and reduction
of the manpower required to accomplish this task
with a consequent lowering of the cost. Presumably
many of the records now in machine-readable
form were derived from MARC records distributed
over the past several years. However, since MARC
has covered only recent English language materials,
this pool of machine-readable bibliographic
records must include a large number of titles not
currently available in MARC.

Counterbalancing the possible advantage of using
these records is the fact that they are known
to vary considerably in terms of their bibliographic
content and their machine format. Thus, they
would have to be processed centrally to allow them
to be integrated with records being produced by
the Library of Congress. To determine what problems
would be encountered in this processing, the
RECON Working Task Force undertook to gather
information about representative machine-readable
data bases and to assess their potential for this
purpose. The task was divided into two phases.
Phase I was a survey of existing library data bases
and an analysis of the machine-readable records as
to bibliographic content and format compatibility
with MARC. Selected data bases became candidates
for further analysis. Phase II was the analysis
of the costs and methods (when applicable) to
utilize data bases from various sources (selected
data bases from Phase I) to build a national bibliographic
store and a comparison of these costs
with the costs of RECON conversion at the Library
of Congress.



Phase I: Survey of Existing Library Data
Bases

Methodology

A list of libraries having cataloging data in
machine-readable form was compiled. Some of
these were already known to exist; some were
identified by a review of the library literature.
The report entitled Book Form Catalogs: A Listing,
prepared by the ALA RTSD Book Catalogs
Committee, was useful in identifying some of the
less well-known data bases.

The following criteria were established for selec- 
tion of data bases to be surveyed : 

1) The data base had to include records for 
monographs. 

2) Data bases known to have fewer than 15,000 
records were excluded. 

3) Data bases known to be entirely or predominantly
based on MARC Pilot Project or MARC records
were excluded.

4) Data bases had to be potentially available to
RECON. This eliminated data bases with security
restrictions and most commercial data bases.
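
The four criteria above amount to a filter applied to each candidate data base. The following is a minimal sketch, assuming a simple record per data base; the class `DataBase`, its field names, and the sample entries are hypothetical and do not represent any library actually surveyed.

```python
from dataclasses import dataclass


@dataclass
class DataBase:
    # All field names are hypothetical, chosen for this sketch.
    name: str
    record_count: int
    includes_monographs: bool
    predominantly_marc: bool
    available_to_recon: bool


def meets_criteria(db, minimum_records=15_000):
    """Apply the four survey selection criteria to one data base."""
    return (
        db.includes_monographs
        and db.record_count >= minimum_records
        and not db.predominantly_marc
        and db.available_to_recon
    )


candidates = [
    DataBase("Library A", 40_000, True, False, True),
    DataBase("Library B", 9_000, True, False, True),    # too small
    DataBase("Library C", 60_000, True, True, True),    # mostly MARC
    DataBase("Library D", 25_000, True, False, False),  # not available
]
selected = [db.name for db in candidates if meets_criteria(db)]
```

The sketch assumes the size of every data base is known; the text excludes only data bases known to be below the threshold, so a case of unknown size would need separate handling.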

No attempt was made to be exhaustive in identifying
existing data bases meeting these criteria.
The purpose of the study was to investigate the
overall problem. If use of outside data bases is
judged feasible, a more comprehensive survey can
be undertaken. Thus, failure to consider a particular
data base does not necessarily mean that
it might not meet the above criteria and be potentially
useful to RECON. However, the RECON Working
Task Force is reasonably sure that most data
bases meeting these criteria were examined.



A two-step survey was undertaken. The first
survey elicited information that would serve to
discriminate data bases of low potential utility
from data bases that warranted further study. A
general questionnaire requested information on
availability, size and composition, the data elements
in the catalog record format, the character
set used, and the source of the cataloging data upon
which the machine-readable input was based. The
questionnaire was sent to 42 libraries, 33 of which
responded.

When the questionnaire returns were analyzed, 
four data bases were judged to be outside the 
scope of the study and seven others were excluded 
from further consideration because some were 
quite small and others contained only brief cata- 
log records. Although other factors were consid- 
ered, the 22 libraries selected for the follow-up 
survey were chosen primarily on the size of the 
data base and the fullness of the catalog record. 
These libraries were asked to submit additional 
information including format documentation, sam- 
ple catalog cards, sample input worksheets, etc. 
The information requested was, in general, supplied
from two different sources. Bibliographical
materials were supplied by a bibliographical resource
person who had been specified by the respondent
on the initial questionnaire; similarly,
technical data were supplied by the designated
technical resource respondent.

A worksheet was prepared to reduce all the 
documentation provided for each system to a 
standardized form. This worksheet provided for 
a generalized description of data base character- 
istics based on the initial questionnaire response, 
a brief summary of the major features of the format,
a field-by-field comparison of the local record
with the MARC format, and a sample catalog
output.

Analysis of Machine-Readable Formats

The evaluation of each format in terms of po- 
tential usefulness was made on the basis of sub- 
mitted documentation and, in some cases, by 
follow-up telephone inquiries. Since many of the 
formats were relatively complicated and some am- 
biguities existed, errors in interpretation may have 
been made. It is believed, however, that changes 
in minor details would not affect the major find- 
ings of this study. 

Analysis and comparison of 22 machine-readable
catalog formats is a sobering experience. The variation
among them was greater than had been anticipated.
In the beginning, it was assumed that some
basic patterns would be discovered and that these
would provide the overall framework for the
analysis. Attempts to discover these basic relationships
were fruitless, however. In the end, although
the format of each data base was compared
to MARC, the data bases were categorized more
from the point of view of bibliographical completeness
than according to technical characteristics.

Analysis was made difficult also by the nature 
of the documentation, the imprecision of termi- 
nology, and the lack of clear data field definitions. 
Each of these points is briefly discussed below. 

The amount of documentation supplied for the
data bases ranged from extremely detailed to extremely
sparse. In most cases the available documentation
consisted of bits and pieces, but some
libraries provided well-organized, logical, and unified
documentation. Generally, the lack of sufficient
documentation was a serious handicap to
detailed format analysis.

Both technical and bibliographical terminology
presented problems. Terms such as "title paragraph"
were frequently used, but the scope of the
terms often differed widely. For example, sometimes
the title paragraph included the edition
statement and sometimes the latter was considered
as a separate field. Fields named "topical subjects,"
"subject headings," and "subject tracings" had to
be examined against the input records to try to
determine how they were defined for a particular
format. In only a few instances were the data
fields clearly defined with examples, scope, and
limits explicitly stated. Nonstandard and obviously
local terminology also presented problems.

In general, the format descriptions were more 
detailed with respect to those data fields associated 
with control and housekeeping information than 
they were for bibliographical information. In 
some instances the system documentation merely 
indicated the bibliographical portion of the 
format as "variable field data." In such cases, the 
variable fields had to be determined from an 
examination of the sample input and output 
documents and the responses on the questionnaire. The 
danger of this is that the samples supplied 
provide only a limited number of records and 
probably do not illustrate the possible range of fields 
included. 

It would be theoretically possible to rank 
formats on a weighted basis from "most like MARC" 
to "least like," but the analysis would be extremely 
complex and costly because of the large number of 
variables involved. The recommendations in this 
study are based on a subjective ranking of the 
formats. In making the recommendations an 
attempt was made to keep the following variables 
in mind for each format: completeness of the 
bibliographical data, structure of the format, size 
of data base, nature of the library, future growth 
potential, proportion of non-MARC records, 
character sets used, and nature of catalog source (e.g., 
local, Library of Congress, commercial vendor). 

For each format, the data fields were compared 
to the MARC format on the basis of the following 
conditions: 1) field present in both formats, 2) 
field not present in the local format and not 
capable of being generated from other data in the 
record, and 3) field not present in local format and 
judged to be capable of being generated from 
other tagged data in the record or by use of a 
format recognition algorithm. The evaluation did 
not include a fourth condition: data fields present 
in the local record and not provided for in the 
MARC format. In most cases, these local fields were 
tagged and it was assumed that a conversion 
program could strip these elements automatically. 
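
The three comparison conditions can be illustrated with a minimal Python sketch. The field names, the function names, and the set of fields judged derivable are hypothetical; the study made these judgments by inspection of documentation and samples, not by program.

```python
from enum import Enum


class FieldStatus(Enum):
    PRESENT = 1        # condition 1: field present in both formats
    NOT_DERIVABLE = 2  # condition 2: absent locally and cannot be generated
    DERIVABLE = 3      # condition 3: absent locally but judged capable of generation


def compare_to_marc(marc_fields, local_fields, derivable_fields):
    """Classify every MARC field for one local format."""
    result = {}
    for field in marc_fields:
        if field in local_fields:
            result[field] = FieldStatus.PRESENT
        elif field in derivable_fields:
            result[field] = FieldStatus.DERIVABLE
        else:
            result[field] = FieldStatus.NOT_DERIVABLE
    return result


def local_only_fields(local_fields, marc_fields):
    """Fields a conversion program would strip (the unevaluated fourth condition)."""
    return sorted(set(local_fields) - set(marc_fields))


# Hypothetical field lists, for illustration only.
marc = ["main entry", "title", "edition", "imprint", "series note"]
local = ["main entry", "title", "imprint", "local call number"]
derivable = {"edition"}

report = compare_to_marc(marc, local, derivable)
stripped = local_only_fields(local, marc)
```

Here `report` maps each MARC field to one of the three conditions, and `stripped` lists the local fields that have no MARC counterpart.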

The data bases were divided into three groups 
after the analysis was completed: high potential, 
medium potential, low potential. It must be 
emphasized that these value judgments are made 
only with respect to RECON needs and are in no 
way meant to reflect on either the quality of the 
local collection, system, or data base or on the 
suitability of a particular format for a given 
library's needs. Although it is difficult to define 
rigidly the differences between the three groupings, 
the major characteristics of the data bases in each 
group are summarized below. 

High-Potential Data Bases 

Although this group does not comprise the 
largest aggregate of records, it should become the 
largest source of unique titles when some planned 
conversion projects are completed. 

The records are similar to LC MARC records in 
terms of fullness of catalog entry, record 
structure, and discrimination of fields and subfields. 
Since most of the formats for these data bases 
were developed during or since the MARC projects, 
they incorporate many MARC-like features, e.g., a 
fixed field segment, a record directory, and a 
variable field segment. With one exception, the 
records contain upper and lowercase characters and 
most of the character sets include diacritical 
marks. 

Medium-Potential Data Bases 

With one exception, the data bases in this group 
contain fewer than 100,000 records and average 
about 50,000 items each. They tend to have fairly 
full catalog entries, although bibliographic notes, 
illustration statements, and size indication are 
usually excluded. Most of the entries are based on 
LC cataloging, edited to conform to local 
practices, and most have LC subject headings and 
many have LC classification numbers. Some of the 
data bases compare favorably with the top group 
in terms of bibliographical completeness but a few 
omit certain fields, such as place of publication, 
series notes, and bibliographical notes. Five of 
the data bases include the LC card number as part 
of the record although this field is not present in 
every record. 

The distinction between these data bases and the 
high-potential group is primarily attributable to 
variations from the MARC format and the absence 
of a rich tagging and coding of the data. The 
formats tend to be more sophisticated than those in 
the low-potential category and generally the 
format provides for some fixed field codes in 
addition to the standard cataloging data. Three of the 
libraries have data bases encoded in uppercase 
character sets, but the majority use upper and 
lowercase characters. 

Low-Potential Data Bases 

These data bases cannot be characterized by 
size: they range from about 15,000 entries to 
500,000 and, therefore, include some of the 
largest data bases for monographs in existence. They 
can be characterized in terms of fullness of 
monograph entry. Most of them include only the main 
entry, title (sometimes only a short title), brief 
imprint, and local call number. Even those data 
bases which contain fuller entries usually 
eliminate details such as bibliographical notes, 
illustration statements, size, and series notes and tracings. 

The formats used by libraries in this category 
are generally quite simplified. Almost no fixed 
field codes describing the cataloged item are 
included since the record is usually limited to those 
data fields required for brief entry book catalogs 
and for circulation purposes. Most of these data 
bases are encoded in uppercase character sets. 

Most of the data bases in this group were 
eliminated on the basis of the initial questionnaire 
return; a few were eliminated after more detailed 
information was supplied. 

Findings 

The aggregate data base of the 20 in-scope 
respondents amounts to more than 8,720,000 records 
of all types, including about 2,500,000 records for 
monographs. Of course, these figures do not 
represent unique records, but it was impossible to 
estimate the amount of duplication among data bases. 
The cost of producing this store of records is 
difficult to estimate precisely, as a number of different 
methods were used by a variety of organizations, 
many of which were, of necessity, investing a 
substantial part of their efforts in learning the 
required techniques of the field. It seems unlikely, 
however, that the average cost is less than $1.00 
per record. Thus, the library community has 
already invested several million dollars. These are 
impressive totals, especially when one considers 
that some libraries did not respond to the 
questionnaire. 

Table 4.1 shows the aggregate number of 
records in each category of data base and the number 
of new records added per year. Annual additions 
to the data base were reported only by those 22 
libraries taking part in the intensive survey. 

These figures are evidence of the tremendous 
expenditure of manpower and money that has 
already gone into the creation of machine-readable 
data bases. A small segment of the library 
community has been able to create a substantial 
machine-readable data base within a few years. 

From the standpoint of standardized 
bibliographic control, the picture is less favorable. The 
bibliographic variations among records can be 
readily seen in Table 4.1. In general, the high 
potential group shows the greatest bibliographical 
conformity, but even here there are significant 
differences as well as many differences in format 
structure and tagging. 

It appears that the more recently a data base was 
created, the more bibliographically complete it is 
and the more flexible the format tends to be. Thus, 
the influence of the MARC format is beginning to be 
felt. Of the 29 data bases reported, 22 use 
non-MARC formats, 2 use the MARC Pilot Project 
format, and 5 use a format based on or identical to 
the MARC format or are planning to convert to this 
format. A few of the non-MARC formats were (in 
the opinion of the respondents) compatible with or 
convertible to MARC, but no respondent reported any 
actual attempt at such a conversion. 

No data base was discovered that was identical 
to the LC MARC data base from a technical 
viewpoint, although some are nearly so. Most data bases 
depart from standard LC catalog information by 
modifying some data fields locally and adding or 
eliminating others. 

Table 4.1. Characteristics of 29 machine-readable 
bibliographic data bases, by RECON use potential 

[Number of data bases. Only the character set rows 
are legible in the source; the column assignment 
follows the prose descriptions of the three groups.] 

Characteristic            High    Medium    Low 

Uppercase only               1         3      7 
Upper and lowercase         10         5 
Diacritics                   7         3      1 

[Rows whose values are not recoverable: MARC; LC full 
cataloging; LC cataloging modified; Local-full 
cataloging; Local-brief cataloging; MARC Pilot Project; 
MARC; Language: English, Foreign; Title; Short title; 
Edition; Imprint-full; Imprint-brief; Series notes; 
Other notes; LC subject.] 

Two high potential data bases with a high 
degree of compatibility with MARC and a medium 
potential data base that differs from MARC both in the 
level of content designators and in 
bibliographical completeness were selected for the Phase II 
analysis. 

SURVEY PARTICIPANTS 



Abel, Richard & Co., Inc. 
Air Force Cambridge Research Laboratory Library 
Austin (Texas) Public Library 
Baltimore County (Maryland) Public Library 
Black-Gold Cooperative Library System, Ventura, California 
Chester County (Pa.) Library System 
Cuyahoga (Ohio) Community College Library 
Eastern New Mexico University 
Georgia Institute of Technology Library 
Harvard University, Widener Library 
Honnold Library for the Claremont Colleges, Claremont, California 
Indiana University of Pennsylvania, Indiana, Pa. 
Jefferson County (Colorado) Public Library 
Montgomery County (Maryland) Library 
National Agricultural Library 
National Library of Medicine 
New York Public Library (Branch, Dance, & Research) 
Redstone Scientific Information Center 
San Antonio College 
Stanford University Undergraduate Library 
SUNY Upstate Medical Center Library 
Tennessee State Library 
University of California, Santa Cruz 
University of California, Union Catalog, ILR 
University of Chicago Library 
University of Vermont, Dana Medical Library 
Vancouver Island (British Columbia) Regional Library 
Washington State Library 
Yale University Medical Library 

Phase II: Analysis of Costs and Methods 

The attraction of using machine-readable 
records from other libraries as input to the LC 
MARC data base lies largely in the fact that such a 
procedure would eliminate at least the need for 
original keyboarding of the record. Against this 
obvious saving in labor, one has to measure the 
costs of acquiring the data base from the 
originating library, converting the format to MARC, 
searching the LC Official Catalog, and updating the 
records. The purpose of this phase of the study is 
to analyze what is required to convert the data 
bases of two representative libraries selected as the 
result of Phase I and to estimate the cost of doing 
so. 

Programming Costs 

A fixed cost of using a non-MARC data base 
for the purpose of adding records to the LC MARC 
data base is the cost of programming to convert the 
record format of the given data base to LC MARC 
format. This cost is not strictly fixed, as the larger 
the data base, the more worthwhile it might be to 
add certain features to the program to take care 
of special cases that for smaller data bases might 
be more economically done manually. In addition, 
the experience gained as each program is written 
should tend to reduce the effort and, therefore, the 
cost of subsequent programming. However, in this 
analysis, the primary cost of programming format 
modifications is considered to be independent of 
data base size and experience. 

There are two methods of converting the records 
of an outside library to the MARC format: 

1) The data can be converted directly to the MARC 
format. 

2) The data can be converted to the input format 
required for the LC format recognition programs 
(data strings without content designators). 

It would appear that the choice of method would 
depend on the degree of similarity between the 
input data base and the MARC format. However, 
even with a close approximation to the MARC 
format, it might pay to use format recognition 
whenever the bibliographic data (including 
punctuation) are taken from LC catalog cards because 
the sophistication of the format recognition 
programs enables them to perform with remarkable 
accuracy. Since much of the same developmental 
work might be necessary to write a program to 
convert an input tape in another format to MARC, such 
a program would be costly to develop. Although a 
program to convert a record to the input format 
required for format recognition may be complicated 
by the tagging scheme of the other library's 
records, the fact that the format recognition programs 
are operational offers the possibility of a great 
saving in programming time. It should be emphasized 
that the success of such a program is dependent on 
the degree of explicit identification of the data 
elements in the input record; that is, the extent to 
which they resemble MARC content designators. 

In the final analysis, each data base must be 
studied individually before a method of conversion can 
be chosen. Since the problems associated with 
either approach have similar characteristics, it was 
assumed that the technique would be conversion to 
the input format required for format recognition. 






Published information on the cost of designing, 
writing, and testing computer programs is 
surprisingly sparse. Dolby reports that a primary 
factor in estimating programming cost is the size 
of the program because the cost tends to increase 
as the square of the size of the program rather 
than as a linear function. Theoretical support for 
such an argument can be made by observing that 
the number of possible interactions between pairs 
of instructions (and hence, possible program 
errors) increases roughly as the square of the 
length of the program. Thus, the well-known 
advantage of modularizing programs through the use 
of subroutines, macros, etc., can be explained by 
the fact that modularity reduces a large program to 
a sequence of small subprograms, which has the 
effect of reducing the number of interactions among 
instructions. 
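
The pairwise argument can be made concrete with a short Python sketch. It rests on two simplifying assumptions that are not in the source: every pair of instructions is counted as one possible interaction, and interactions between separate modules are ignored entirely. The function names are illustrative.

```python
def pair_interactions(n):
    """Number of distinct pairs among n instructions: n(n-1)/2."""
    return n * (n - 1) // 2


def modular_interactions(n, modules):
    """Pairs counted only within each of `modules` equal-sized subprograms.

    Assumes n divides evenly and that modules do not interact with each other.
    """
    size = n // modules
    return modules * pair_interactions(size)


whole = pair_interactions(1000)          # one 1,000-instruction program
split = modular_interactions(1000, 10)   # ten 100-instruction subprograms

print(whole, split)
```

Under these assumptions, dividing the program into ten modules cuts the number of instruction pairs by roughly a factor of ten, which is the effect the paragraph attributes to modularity.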

Aron mentions four techniques in his article 
"Estimating Resources for Large Programming 
Systems." Two of these techniques, the Constraint 
Method and the Units of Work Method, are not 
applicable because they vary or subdivide the task 
to fit the available manpower. The other two, the 
Quantitative Method and the Experience Method, 
are worth considering for this study. 

The approach used in the Quantitative Method 
is to estimate the size of a desired program 
(instructions or lines of code) by comparing program 
requirements with those of similar projects. For a 
large project, individual estimates for subunits at 
various levels can be used. The total number of 
instructions is subdivided into three classes: easy, 
medium, and difficult. The programming labor 
required is computed at the rate of 20 instructions 
per man-day for easy programming, 10 instructions 
for medium programming, and 5 instructions for 
difficult programming. The hours of direct labor 
obtained this way are adjusted to allow for 
supervision and other overhead factors and converted 
into costs by applying appropriate rates. 
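
A minimal sketch of the Quantitative Method arithmetic follows. Only the three productivity rates come from the text; the 8-hour day, the overhead factor, the hourly rate, and the instruction counts in the example are assumptions chosen for illustration.

```python
# Instructions per man-day for each difficulty class, as given in the text.
RATES_PER_MAN_DAY = {"easy": 20, "medium": 10, "difficult": 5}


def quantitative_estimate(instructions, hours_per_day=8.0,
                          overhead_factor=1.0, hourly_rate=18.0):
    """Estimate labor and cost from instruction counts by difficulty class.

    hours_per_day, overhead_factor, and hourly_rate are assumed values.
    """
    man_days = sum(count / RATES_PER_MAN_DAY[level]
                   for level, count in instructions.items())
    direct_hours = man_days * hours_per_day
    adjusted_hours = direct_hours * overhead_factor
    return {
        "man_days": man_days,
        "direct_hours": direct_hours,
        "cost": adjusted_hours * hourly_rate,
    }


# Hypothetical program: 400 easy, 300 medium, 100 difficult instructions.
estimate = quantitative_estimate({"easy": 400, "medium": 300, "difficult": 100})
print(estimate)
```

With these assumed inputs the estimate is 70 man-days, or 560 direct hours. An overhead factor greater than 1.0 would raise the cost proportionally.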

The approach used in the Experience Method 
is to estimate the cost, size, and time requirements 
of a programming project by comparing it with 
similar previous ones. Although this method is 
inexact, it is widely used because of its 
practicality. Size and man-hour figures are available for a 
program that has been written to convert MARC 
records to an input format to test format 
recognition processing. This program converts a MARC 
record to a record containing data strings without 
content designators. When this record is processed 
through format recognition, the result should be 
a record identical to the one originally converted 
to data strings. The program was written in 
Assembly Language Coding; it contains 1,520 lines 
of code. It took 555 man-hours of programming. 
Assuming a cost of $18.00 per hour for contractual 
programming, the program cost was 
approximately $10,000. While it is true that the number of 
lines of coding and ultimately the cost of the 
program depend on the programming language used 
and the competence of the programmer, it was felt 
that the direct experience with an almost identical 
problem justified using the LC estimate. 
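
The figures quoted for the LC program can be checked directly. The three inputs are taken from the text; the derived productivity figures are simple ratios computed here for illustration and are not reported in the study.

```python
LINES_OF_CODE = 1520   # from the text
MAN_HOURS = 555        # from the text
HOURLY_RATE = 18.00    # contractual programming rate assumed in the text

program_cost = MAN_HOURS * HOURLY_RATE        # dollars
lines_per_hour = LINES_OF_CODE / MAN_HOURS    # derived, not in the source
cost_per_line = program_cost / LINES_OF_CODE  # derived, not in the source

print(f"program cost: ${program_cost:,.2f}")
print(f"lines per man-hour: {lines_per_hour:.2f}")
print(f"cost per line: ${cost_per_line:.2f}")
```

The product is $9,990, which the text rounds to approximately $10,000.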

Processing Strategy 

The steps by which the conversion of another 
library's data base would proceed depend on the 
end in view. Two objectives are possible: 

1) To obtain a record in which the access points 
would be identical with those on the record in the 
LC Official Catalog but the other data would be 
as furnished by the library that created the record. 
This approach assumes that the record is essen- 
tially bibliographically complete. 

2) To obtain a record that is identical with one 
in the LC Official Catalog; such a record would be 
the equivalent of one supplied by the LC MARC 
Distribution Service. 

In devising methods to achieve these objectives, 
it was assumed that records which have no match 
in the LC Official Catalog would not be added to 
the file. Attempting to edit non-LC records for 
inclusion would add a major cost as is shown by 
the study of the requirements for a national union 
catalog in machine-readable form (see Chapter 
5). The decision to disregard nonmatching records 
in the present study was made on practical 
grounds: there is no convenient way to estimate 
their proportion in any given data base. 

For the method to achieve Objective 1, the 
following assumptions were made: 

1) Main, added, and subject entries would be 
changed as necessary to make them identical with 
the corresponding LC headings used for the same 
record. 

2) Data on an LC card that are lacking in the 
other library's record would not be added, except 
for LC card number, international standard book 
number (ISBN), and the LC call number. 

3) If the other library's record has access points 
not on the LC card, they would be excluded. 

4) Proofing would amount to 50 percent of the 
cost of proofing regular MARC/RECON records 
because proofing will be primarily to detect format 
recognition errors and to confirm the accuracy of 
the data in the access points. 

5) The cost of correction typing would be equal 
to that of making corrections on MARC/RECON 
records because it is assumed that catalog comparison 
of access points will not result in any more changes 
than are now made on RECON records and that 
other types of errors (e.g., typographical errors, 
format recognition errors) would be at the same 
level. 

6) The cost of verification (the final reading for 
content) would be 50 percent of the cost of 
verifying MARC/RECON records because only the accuracy 
of the content designators and the primary access 
points would have to be verified. 

For the method to achieve Objective 2, the 
following assumptions were made: 

1) All data on the LC record that were lacking in 
the other library's record would be added. 

2) Data elements not on the LC card would be de- 
leted; data elements that differ would be changed 
to match the LC card. 

3) Although the basic proofing cost would be the 
same as the present MARC/RECON cost, ensuring 
that the entire record matched the LC record in 
every detail would involve extra expense. No 
attempt was made to estimate this cost because it 
would depend on whether the proofing was done at 
the Official Catalog or at a later stage from a copy 
of the LC record. It is probable, however, that the 
cost of performing this task would offset the saving 
of the cost of original keying. 

4) The workload (and, therefore, the cost) of 
typing corrections would be twice that of Method 
no. 1. 

5) The cost of verification would be the same as 
the present MARC/RECON cost. 

It will be observed that the costs in this method 
are essentially those of MARC/RECON; only the 
cost of original keying would be saved. This is 
because certification of an outside library record 
as an LC record requires the complete proofing 
and verification process and the proportion of 
corrections (i.e., fields added or changed) would 
almost certainly be greater than is true for a record 
converted directly at the Library of Congress. 
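
The assumptions for the two objectives amount to multipliers applied to the per-record MARC/RECON costs. The sketch below applies them to base costs supplied by the caller; the base values in the example are hypothetical round numbers, not figures from the study, and the unestimated extra proofing expense under Objective 2 is left out.

```python
def objective_costs(base):
    """Apply the stated assumptions to per-record MARC/RECON base costs.

    `base` needs the keys "proofing", "typing corrections", and "verifying".
    """
    objective_1 = {
        "proofing": 0.5 * base["proofing"],                # 50 percent of MARC/RECON
        "typing corrections": base["typing corrections"],  # equal to MARC/RECON
        "verifying": 0.5 * base["verifying"],              # 50 percent of MARC/RECON
    }
    objective_2 = {
        # Lower bound only: the extra cost of matching every detail is not estimated.
        "proofing": base["proofing"],
        "typing corrections": 2 * objective_1["typing corrections"],
        "verifying": base["verifying"],
    }
    return objective_1, objective_2


# Hypothetical base costs in dollars per record, for illustration only.
base_costs = {"proofing": 0.60, "typing corrections": 0.10, "verifying": 0.60}
obj1, obj2 = objective_costs(base_costs)
print(obj1)
print(obj2)
```

Neither dictionary includes original keying, which is the one cost both objectives avoid.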

As noted above, in processing an outside library 
data base, it is necessary to eliminate all records 
already in the MARC data base as well as those 
outside its present scope. The procedures for doing 
this would be heavily dependent upon the nature 
of the data base being processed. 

For the purpose of this analysis, records eligible 
for selection from a data base may be defined as 
records represented in the LC Official Catalog, but 
not yet included in the MARC data base. Ineligible 
records include those that duplicate existing MARC 
records, records not represented in the LC Official 
Catalog, and records for forms of material not yet 
included in the MARC Distribution Service. 
Language would not be grounds for declaring a record 
ineligible. Identification of ineligible records 
should be done by computer when feasible but, in 
many cases, eligibility can be determined only by 
manually checking the records against the LC 
Official Catalog. 

The cost of machine searching varies with the 
amount of manipulation of fields required to 
derive the search argument. Manual searching 
against the Official Catalog costs approximately 
$.10 per record. Since the rental cost of the LC IBM 
360/40 configuration is $27,767 per month or $.0438 
per second, based on 176 hours per month, the 
manual searching cost is approximately equal to the 
cost of 2.3 seconds of machine time. This exceeds 
the time required even for a relatively complex 
machine search. Therefore, machine procedures 
should be used, wherever possible, to decrease the 
number of manual searches required against the 
Official Catalog. 
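
The comparison of manual and machine search costs can be reproduced from the three figures given above. The variable names are illustrative.

```python
MONTHLY_RENTAL = 27767.0    # dollars per month, from the text
HOURS_PER_MONTH = 176       # from the text
MANUAL_SEARCH_COST = 0.10   # dollars per record, from the text

# Rental cost of one second of machine time.
cost_per_second = MONTHLY_RENTAL / (HOURS_PER_MONTH * 3600)

# Machine seconds that cost as much as one manual search.
break_even_seconds = MANUAL_SEARCH_COST / cost_per_second

print(f"machine cost per second: ${cost_per_second:.4f}")
print(f"break-even machine time: {break_even_seconds:.1f} seconds")
```

A machine search that takes less than the break-even time per record is cheaper than the manual search it replaces.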

The following statements suggest various 
techniques that might be used. It may be expedient to 
process the source data against the MARC data 
(matching on LC card number, if available, or an 
author/title search code) to eliminate records 
already in the MARC data base. Again, depending on 
the characteristics of the source data base, 
automatic algorithmic deletion by language and 
imprint date (e.g., English language records with 
publication date 1968 or later) might be an 
alternative technique to searching to eliminate records 
already in MARC. Likewise, if the source data came 
from a library that used LC cataloging data when 
available and always included the LC card number 
in machine-readable records, it might prove 
expedient to automatically delete records without LC 
card numbers based on the assumption that the 
lack of an LC card number in the source record 
would be good evidence that the record would not 
appear in the Official Catalog. This technique 
would reduce manual searching with only slight 
risk of losing records that should have been 
included. Naturally, the validity of this assumption 
would have to be tested in any given situation. 
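
The first technique, matching source records against the MARC data base, might be sketched as follows. The record layout, the card numbers, and in particular the construction of the author/title search code are hypothetical; the text does not specify how such a code would be derived.

```python
def search_code(author, title, n_author=4, n_title=4):
    """Hypothetical author/title key: leading characters of each, uppercased."""
    def squeeze(text, n):
        letters = [ch for ch in text.upper() if ch.isalnum()]
        return "".join(letters)[:n]
    return squeeze(author, n_author) + "/" + squeeze(title, n_title)


def drop_records_already_in_marc(source, marc_card_numbers, marc_search_codes):
    """Keep only source records that find no match in the MARC data base."""
    kept = []
    for rec in source:
        card = rec.get("lc_card_number")
        if card is not None:
            # Match on LC card number when the record carries one.
            if card in marc_card_numbers:
                continue
        elif search_code(rec["author"], rec["title"]) in marc_search_codes:
            # Otherwise fall back to the author/title search code.
            continue
        kept.append(rec)
    return kept


# Made-up records and card numbers, for illustration only.
marc_cards = {"68-12345"}
marc_codes = {search_code("Melville, Herman", "Moby Dick")}
source = [
    {"lc_card_number": "68-12345", "author": "Doe, John", "title": "A Recent Book"},
    {"lc_card_number": None, "author": "Melville, Herman", "title": "Moby Dick"},
    {"lc_card_number": "52-00001", "author": "Doe, Jane", "title": "Older Work"},
    {"lc_card_number": None, "author": "Roe, Ann", "title": "Unmatched"},
]
kept = drop_records_already_in_marc(source, marc_cards, marc_codes)
```

The surviving records would still require a search of the Official Catalog to establish eligibility.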

It must be kept in mind that the expected yield 
of ineligible records as the result of a machine 
search has an important bearing on whether it 
should be made. For example, comparison on LC 
card number against the MARC data base would 
not be economic for a large retrospective (i.e., 
before 1968) data base even if LC card numbers 
were given, because the disk lookup against a 
table of LC card numbers for each record in the 
file would yield few records for a large cost. In 
this situation, algorithmic deletion by language 
and publication date might be more appropriate. 
Assuming that the characteristics of the data bases 
are known by their creators, the need for an 
individual determination of the particular strategy 
for eliminating records for that data base should 
not pose a major problem. 

Depending on the characteristics of the data 
base, the automatic deletion of ineligible records 
might precede or follow the conversion of the 
records to the input format required for format 
recognition. Records would be processed through 
format recognition, printed, searched to delete 
records that are not eliminated automatically, and 
(when eligible) compared manually against the 
matching LC records. After proofing for the 
accuracy of format recognition, the records would 
be modified to reflect corrections of content 
designators and any data changed as a result of 
catalog comparison. The records would be printed 
again for proofing and final verification prior to 
being added to the MARC data base. 

Procedure Costs 

Computer run costs have not been included in 
the cost estimates. Analysis indicates that the per 
record cost of original conversion at LC (input 
format from the DigiData converter to input for 
format recognition) is approximately equal to the 
cost of converting a record from another data base 
to the input format for format recognition. Since 
these costs are not reported as part of the RECON 
cost estimates, they have not been included as part 
of the present estimates. 



It is recognized that some records processed in 
this phase of the conversion may be discovered to 
be ineligible at a later stage. Strictly speaking, the 
cost of processing these ineligible records should 
be prorated among the eligible records. This has 
not been done, however, because there is no valid 
way to estimate the percentage of ineligible 
records discovered this way and, in any case, the 
incremental cost would probably be insignificant in 
relation to the total average conversion cost. 

Programming cost is a function of the number 
of usable records and must be apportioned 
accordingly. The cost of programming to convert 
records of a large research library can be expected to 
be amortized over an indefinite period since such 
a data base will continue to yield records of value 
to a national data base. On the other hand, the 
cost of programming to convert records of a 
smaller library may warrant apportionment only over 
the number of records in a one-time conversion 
effort, if that library's future acquisitions are 
unlikely to contribute significantly to the national 
data base. 
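
The apportionment can be expressed as a simple division, sketched below. The program cost, record counts, and time horizon in the example are hypothetical values chosen only to show how the per-record share falls as a data base continues to yield usable records.

```python
def programming_cost_per_record(program_cost, initial_records,
                                annual_records=0, years=0):
    """Spread a one-time programming cost over the usable records converted."""
    usable = initial_records + annual_records * years
    if usable <= 0:
        raise ValueError("at least one usable record is required")
    return program_cost / usable


# Hypothetical smaller library: one-time conversion of 50,000 usable records.
one_time = programming_cost_per_record(10_000, 50_000)

# Hypothetical research library: 100,000 usable records now, plus
# 30,000 usable records a year over a 5-year horizon.
continuing = programming_cost_per_record(10_000, 100_000,
                                         annual_records=30_000, years=5)

print(f"one-time conversion: ${one_time:.2f} per record")
print(f"continuing source:   ${continuing:.2f} per record")
```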

Representative Conversion Costs 

The generalized conversion strategies were 
applied to two high potential data bases identified 
in the survey described earlier in this chapter; 
they were the University of Chicago Library and 
the Research Libraries of the New York Public 
Library. When the medium potential data base was 
examined, it was found that the records were not 
bibliographically complete enough to be suitable 
for a national data base without adding or 
changing many data elements. Thus, Objective 1 would 
not be appropriate for this data base and Objective 
2 would entail costly updating. Therefore, it was 
decided to limit the cost analysis to the two high 
potential data bases. 

Table 4.2 shows the basic characteristics of these 
two data bases. Since both of them contain records 
that are substantially like those in the LC Official 
Catalog, it is feasible to consider both of the 
conversion objectives that have been described earlier. 

Table 4.2. Characteristics of two high potential data bases, December 1971 

Characteristic                     University of Chicago        New York Public 
                                   Library                      Library 

Size of data base                  175,000                      16,000 (1) 
Annual growth                      30,000 to 40,000             65,000 
Percentage of records taken        17                           15-20 
  from LC MARC 
Percentage of records in English   40 to 60                     45 
Percentage of nonmonographic       2                            20 
  records 
Kind of cataloging data            Full entry, similar to LC    Full entry, similar to LC 
LC card number present             See note (2)                 Yes 
Format                             Based on ISS Planning        MARC with modifications 
                                   Memorandum No. 3 (3); 
                                   detailed field and 
                                   subfield identification 
Indication that record was         Yes                          No 
  taken from LC MARC 

(1) At the time of this analysis, the NYPL system had only been in full 
operation for 1 month. 

(2) Approximately [fraction illegible] of the records do not have LC card 
numbers. The University of Chicago Library does not attempt to supply LC card 
numbers for records locally cataloged or records for which cataloging 
information is obtained from other than LC card sources. 

(3) Avram, Henriette D., Freitas, Ruth S., and Guiles, Kay D., A Proposed 
Format for a Standardized Machine-Readable Catalog Record, Library of 
Congress Report, June 1965. 

The basic steps for the University of Chicago 
Library data base would be as follows: 

1) Eliminate by machine every record with a fixed 
field indicating that it was taken from the MARC 
data base. 

2) At the same time, eliminate other records with 
language code for English and imprint date of 1968 
or later. Automatic deletion of records without 
LC card numbers would not be desirable because 
the absence of a number is no guarantee that 
the item was not also cataloged by the Library 
of Congress. 

3) Convert remaining records to input format for 
format recognition. 

4) Process records by format recognition program. 

5) Print records. 

6) Search records against LC Official Catalog: 

a) to identify other ineligible records. 

b) to compare eligible records against 
matching LC records; for Objective 1, this involves 
checking only access points; for Objective 2, 
it involves checking the entire record. 

7) Proofing for format recognition. 

8) Updating to correct format recognition errors 
and to make changes dictated by catalog 
comparison. 

9) Second proofing and final verification. 

Essentially, the same steps would be followed
for the New York Public Library records, except
for the means of eliminating ineligible records by
machine. Since LC card numbers are present
whenever the cataloging data were taken from an
LC record, there is a reasonable expectation that
records without LC card numbers could be eliminated
at the earliest possible stage. However, the
absence of an indicator that the record was taken
from MARC makes it necessary to use the language/
imprint date algorithm to distinguish ineligible
records among the records with card numbers.
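
The machine elimination rules for the two data bases can be sketched
as follows. This is only an illustration under stated assumptions and
is not part of the study: the record fields, the function names, and
the `cutoff_year` parameter are hypothetical, and the cutoff year used
in the usage lines is arbitrary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    """Hypothetical, simplified view of a library's machine-readable record."""
    language: str                        # MARC-style language code, e.g. "eng"
    imprint_year: int
    lc_card_number: Optional[str] = None
    from_marc: bool = False              # fixed-field indicator (Chicago only)


def presumed_in_marc(rec: Record, cutoff_year: int) -> bool:
    """Language/imprint date test: English-language records with an imprint
    date of cutoff_year or later are presumed to be in the MARC data base."""
    return rec.language == "eng" and rec.imprint_year >= cutoff_year


def chicago_ineligible(rec: Record, cutoff_year: int) -> bool:
    # Records without LC card numbers are deliberately not dropped here,
    # since the item may still have been cataloged by LC.
    return rec.from_marc or presumed_in_marc(rec, cutoff_year)


def nypl_ineligible(rec: Record, cutoff_year: int) -> bool:
    # No MARC indicator is available, so the absence of a card number and
    # the language/imprint date test must do the work.
    return rec.lc_card_number is None or presumed_in_marc(rec, cutoff_year)


if __name__ == "__main__":
    rec = Record(language="fre", imprint_year=1970)
    print(chicago_ineligible(rec, cutoff_year=1969))
    print(nypl_ineligible(rec, cutoff_year=1969))
```

Records that survive this filter would still be subject to the manual
search against the LC Official Catalog, which identifies the remaining
ineligible records.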

The estimated manpower costs per record for
converting these data bases are shown in Table 4.3.

Table 4.3 - Manpower costs for different conversion methods

                               Original       Conversion of other
Function                       conversion     library data bases
                               at LC          Objective 1   Objective 2

Catalog comparison             $0.19              ¹ ² $0.19
Proofing                         .58                    .29
Original typing                  .24
Typing corrections ³             .08                    .08
Verifying                        .59                    .30
Other duties ⁴ and leave        1.17                  ⁵ .59
Total                           2.85                   1.40

¹ Base cost; the cost would increase if a significant number of ineligible
records were identified at this stage.

² Base cost; the cost of proofing to ensure an exact match with the LC
record was not estimated but would make the per record cost appreciably
higher.

³ The best available evidence indicates the typing of corrections accounts
for 25 percent of the total typing cost for MARC/RECON.

⁴ Includes supervision, training, and clerical activities.

⁵ This cost is directly dependent on the costs of the basic functions;
therefore, it fluctuates with them.

Also to be taken into consideration is the one-time
cost of the program to convert records into
the input format required for format recognition.
The actual program cost per record would depend
on the number of eligible records in a data base.
For example, if the program costs $10,000 and
the data base yields 50,000 records, the per record
cost is $.20. In the case of the two data bases
selected, it was assumed that the program would be
useful over an indefinite period and, therefore,
that in the long run the per record cost should
be quite small relative to the total conversion cost.
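
The amortization in the example above is a single division. A minimal
sketch follows; the function name is an assumption introduced here, and
the second call uses a hypothetical lower yield to show how the per
record figure rises as the number of eligible records falls.

```python
def program_cost_per_record(program_cost: float, eligible_records: int) -> float:
    """Spread the one-time cost of the conversion program over the
    eligible records that a data base yields."""
    if eligible_records <= 0:
        raise ValueError("at least one eligible record is required")
    return program_cost / eligible_records


if __name__ == "__main__":
    # Figures from the example in the text: $10,000 over 50,000 records.
    print(f"${program_cost_per_record(10_000, 50_000):.2f}")
    # Hypothetical lower yield from the same program.
    print(f"${program_cost_per_record(10_000, 10_000):.2f}")
```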

For Objective 2, another factor must be taken
into account. The number of data elements that
must be added or changed affects the cost of
correction. An investigation of two samples of
records showed that the University of Chicago
records lack a few basic LC data elements that are
used in the New York Public Library records.
This means that, potentially, Chicago records
would be more costly to bring to the level of LC
records. However, because the volume of corrections
affects typing, the smallest of all manpower
costs, it did not seem worthwhile to estimate the
slight differences in costs that might result from
conversion for Objective 2.

The costs for each of the data bases are shown 
as identical because the estimate of eligible rec- 
ords in each data base was made on the assump- 
tion that only that data base was being converted. 
If several data bases were converted, it can be 
assumed that the percentage of eligible records 
in each new data base would dwindle. This would 
have the effect of increasing the per record cost 
and making the per record programming cost a 
more significant factor. 

System Considerations

Use of other data bases to increase the volume
of a national bibliographic store would involve
several hidden costs, not estimated in the preceding
sections. These costs would relate to liaison
with a number of organizations and the analysis
of many file structures with varying degrees of
associated documentation. Therefore, a best strategy
should be developed in terms of cost as well
as utility. The various data bases should be ranked
according to size and bibliographical completeness
with approximate estimates of proportions of
eligible records (non-MARC vs. MARC records, etc.).
Data base conversion should begin with the highest
ranking file. However, once a few large files have
been converted and put into the national store, the
yield from other data bases might tend to be so low
as to drive the per record cost of the conversion
program too high for economic feasibility.

Any thresholds chosen at this time as to mini- 
mum size of data base and length of the record 



would be quite arbitrary. What is considered a 
threshold value would, in the end, depend on the
form and/or language of the material and when
the data base was being considered for conversion.
For example, if a machine-readable data base of
some 50,000 motion picture and filmstrip records,
meeting appropriate format and bibliographical 
criteria, were to exist in 1975, it could be consid- 
ered a candidate for a national store of machine- 
readable records for that form of material be- 
cause the Library of Congress plans call for ini- 
tiating such a service in fiscal 1973. 

It has already been noted that libraries can be 
expected to know the characteristics of their own 
data bases. In terms of the national interest, it 
would be useful to consider establishing a stand- 
ard for recording and publishing information 
about the form of material, the language and the 
content of machine-readable records in library 
data bases. Such a standard should simplify the 
chore of determining the utility of a data base and 
also make available to the library community as a 
whole detailed specifications for each individual 
library's data base. This standard should be de- 
veloped under the auspices of the American Na- 
tional Standards Institute, Committee Z-39. 

Data Base Acquisition Cost

The question of what charges (if any) should be 
made for the use of a library's machine-readable 
data base for a national bibliographic store must be 
considered. The cost of copying the file as well as 
that of purchasing the tapes necessary for the in- 
terchange might represent a minimum charge. 
Such a charge would be a fraction of a cent per
eligible record. It may be questioned whether a 
library should recover part of its production costs 
in such a transaction. It could be argued that the 
recompense for contributing records to the na- 
tional store should be measured in terms of a li- 
brary's future use of the contributions of other 
libraries. Furthermore, contributions of original 
cataloging made on a continuing basis might con- 
ceivably substitute for reports to the National 
Union Catalog. 

The situation is further complicated by the fact
that several commercial firms purchase MARC tapes
on a regular basis to provide cataloging services
for the library community. The fact that a
contribution to the national store represents a
contribution to profit-making organizations may act
as a deterrent to the transfer of these files on
a cost-only basis. Profit-making organizations
might acquire individual library files directly. In
this event, the library could recover a substantial
part of its investment by making the file available
at a cost in excess of the duplication and tape
purchase costs. Early release of library files to the
national store could conceivably reduce the potential
income to the individual libraries as some
prospective buyers might await distribution of
the records through the MARC service. Consideration
on the part of the commercial firms will certainly
be given to the length of time required for
LC to process all of the data. The economic
implications of this kind of data transfer should be
fully investigated in the near future.

Conclusions 

This study led to the following conclusions:

1) Machine-readable bibliographic data bases do
exist that could be used to increase the volume of
the national store. This study indicates that the
per record cost of converting these records to the
MARC format, comparing them with records in
the LC Official Catalog, and updating their content
to the point where they match those records
approaches the present per record MARC/RECON
cost.

2) The cost of converting the same records if only
the access points were updated appears to be
substantially lower than present MARC/RECON costs.
The minimum cost of this method of data base
conversion is probably on the order of one-half
of present costs. Since these data could not be
used in this form by the Library of Congress, the
question of how this effort could be funded
remains to be resolved.

3) Should any program be undertaken, the high
potential data bases should be ranked by size and
completeness of content of records. The highest
ranking data base should be the first to be
converted. Early consideration might be given to the
University of California libraries file containing
approximately 750,000 records. However, it
should be noted that the character of the records
would have to be evaluated to determine whether
the estimated per record conversion cost held true
for this data base. Lack of the necessary information
made it impossible to make an analysis at
the time of this study.

4) A standard should be established for reporting
the form of material, language, and the content
of machine-readable records in library data bases
to simplify the job of determining the utility of
another library's data base.


Chapter 5



On the Implications of a National Union Catalog in 
Machine-Readable Form 



Introduction 

In the simplest terms, the development of the
National Union Catalog involves combining Library
of Congress catalog records with those of
other libraries to produce a file of discrete entries,
posting to this file information about duplicate
holdings in other libraries, and providing the
resulting information in ways calculated to satisfy
the needs of the library community. The
accomplishment of this task entails many bibliographic
and technical problems which are aggravated by
the volume of information that must be processed
to produce the end result.

The bibliographic problems relate to processing 
reports to build the basic file. They involve : 

1) Identifying new titles. 

2) Making the forms of names in main and added
entries on non-LC cards compatible with names in
the LC Official Catalog. (See Appendix A for a
discussion of the problem of compatibility.)

3) Posting new locations to existing records. This 
task is classed as bibliographic because it arises 
from a search to determine whether a title is new 
to the National Union Catalog. 

4) Providing necessary see and see-also references. 

5) Updating bibliographic records when additions
and corrections are received.

The technical problems relate to the means of
disseminating information from the file. Four criteria
are posited to assess the merits of the means
of dissemination:

1) Completeness: the full bibliographic record 



must be given in at least one readily available 
source. 

2) Currency: the information should be made
available as soon as possible on a regular schedule; 
listings of new titles should appear at least once 
a month. 

3) Convenience: both the format of the published 
information and the frequency of its cumulation 
should facilitate the work of bibliographic 
searching. 

4) Cost: the cost of publishing the information 
should be kept as low as possible so that the Na- 
tional Union Catalog can be widely distributed. 

It is readily apparent that considerations of cost 
have an important bearing on the extent to which
the other criteria can be satisfied. Acceleration of 
the frequency of publication, broadening of the 
cumulation pattern, and improvement in the 
physical format are all directly related to the cost 
of producing a printed catalog. Thus, in the final 
analysis, decisions must be made as to whether 
optimizing currency and convenience justifies the 
cost of doing so, especially when the cost must 
eventually be borne by subscribers to the catalog. 

The following statistics reveal the magnitude of 
the labor required to produce the National Union 
Catalog for 1970:

1) Approximately 226,000 LC catalog records were 
added. 

2) Approximately 108,000 discrete titles cataloged 
by other libraries were added. Actually, a larger 
number of these reports were prepared for the 
monthly catalogs, but in the annual cumulation 
some were replaced by LC records issued at a later
date.

3) Approximately 1,000,000 outside library reports
had to be searched and a further 1,500,000,
which were immediately identifiable as reports of
additional locations, were forwarded for posting
to the file. (See Appendix B for details on NUC
reporting.)

The principal vehicle for making this tremendous
mass of information available is the National
Union Catalog: A Cumulative Author List. This
book catalog is issued in what can be broadly
described as monthly, quarterly, annual, and
quinquennial cumulations; the actual pattern of
publication will be described fully in a later section.
The arrangement is by main and added name entries.
Except for titles that are main entries, there
is no access by title or series. Full bibliographic
information appears only under the main entry;
added entries take the form of references to the
main entry. The main entry records consist of LC
catalog cards and especially typed versions of outside
library reports. Added entries and name references
are typed on cards unless (in the case of new
name references) a printed reference is available.
All of the cards are arranged in one sequence,
mounted in three columns on large pieces of cardboard,
photographed in a reduced size, then
printed and bound by standard methods. The 1970
annual cumulation of this catalog consisted of 14
volumes, comprising approximately 13,000 pages.

The second major component of the NUC is the
Library of Congress Catalog - Books: Subjects.
This catalog is limited to LC catalog records because
the cost of editing outside library reports to
provide consistent subject headings would be
prohibitive. Despite this restriction, this catalog
does provide partial subject access to NUC because
the majority of LC items are held by other libraries.
In 1970, the annual cumulation of the subject
catalog required 5 volumes, comprising approximately
9,000 pages.

The third component of the NUC is the Register
of Additional Locations, containing location reports
that were received after a catalog entry has
been printed in an annual cumulation. Because of
the huge number of these reports, they are grouped
by the year the original bibliographic record was
prepared and issued in segments. Thus, the 1970
issue of the register consisted of two volumes,
covering LC cards and NUC reports dated 1964
and 1965.

As the coverage of the MARC data base grows,
and as the capability of local input is added at the
regional level at such centers as the New England
Library Network and the Ohio College Library
Center, the concept of on-line union catalogs is
fast becoming a reality. It does not follow, however,
that knowledge so far gained from the very
limited, largely conceptual, experience with
machine-readable union catalogs can be extrapolated
to the much larger, more complex system to provide
on-line access to the National Union Catalog.
It appears safe, therefore, to predict that, for some
time to come, we will make use of NUC information
in book form or microform. However, the growth
of the MARC data base does make it feasible to
produce these catalogs by computer, thereby relieving
humans of much of the drudgery of preparing the
catalogs and, at the same time, offering the possibility
of additional access points to the bibliographic
information.

Therefore, it was logical to include in the RECON
studies a preliminary analysis of what would be
involved in the production of the NUC from cataloging
data in machine-readable form. The aim
was only to consider the bibliographic and technical
implications of a machine-readable NUC data
base as a foundation for future investigations. The
magnitude of the problems and the constraints of
time, funds, and manpower available to the task
force precluded formulation of a detailed system
design with associated cost estimates.

Design for a National Union Catalog 

Since the results of the RECON Pilot Project
conducted at the Library of Congress make it unlikely
that any large-scale retrospective conversion effort
will be undertaken in the near future, this study
concentrated on an NUC for current materials.
Problems of including retrospective records and
their locations were not taken into account.

In considering the optimum format for a National
Union Catalog produced from machine-readable
records, the RECON Working Task Force
selected the register/index form of catalog because
it allows favorable cumulation patterns and more
points of access without having to print a full
bibliographic record more than once. Basically,
this type of catalog comprises:

1) A register of complete bibliographic entries ar- 
ranged by numbers assigned as each item enters 
the system. Register volumes are issued regularly 
but they are never cumulated.

2) Indexes providing various access points derived 
from the register entry. The index entry includes 
a brief bibliographic identification and the register 



number together with any other data that may be 
desired. The indexes may be in dictionary form or 
divided into sections (e.g., name, title, subject). An
index volume is issued with each register volume 
and the various indexes are cumulated regularly. 

The future NUC, as conceptualized by the RECON
Working Task Force, would have the following
indexes:

1) Name index: personal and corporate names used
as main, subject, and added entries (including
author/title series).

2) Title index: titles used as main, subject, and 
added entries (including uniform title headings 
and series entered under title) . 

3) Topical subject index (including geographic 
subject headings). 

Figure 5.1 shows the entry elements and their
MARC tags for each of the indexes. Table 5.1 gives
the order of data elements in each type of index
entry. Index entries under added and subject entries
would include the full form of main entry.
Each index entry for an LC record would include
the LC card number. The index entry for the main
entry would include all locations reported to the
date of the cumulation. Figure 5.2 presents examples
of register and index entries for a typical
bibliographic record.

Although index entries would be designed to be
complete for many purposes (e.g., initiating an
interlibrary loan, obtaining an LC card number),
it would sometimes be necessary to check the register
volume to obtain the full bibliographic information
(as in cataloging). The disadvantage
of double look-up is minimized by the fact that
the second search by register number is
straightforward.

A prime advantage of the register/index catalog 
is that each register volume is complete as issued 
and its contents need never be merged with those 
of other register volumes. The index volumes are 
cumulated but, as the entries are shorter, they lend
themselves to compact presentation thereby effect- 
ing an overall savings in publication costs as com- 
pared with conventional book catalogs. The com- 
pactness of the indexes would also facilitate rapid 
scanning of entries. 



Figure 5.1 - Entry elements (and their MARC tags) covered in proposed NUC indexes

NAME: Entries beginning with a personal, corporate, or
conference heading

    Main: 100, 110, 111
    Added: 700, 710, 711
    Subject: 600, 610, 611
    Series (traced as in note): 400, 410, 411
    Series added entry: 800, 810, 811

TITLE: Entries beginning with a title

    Uniform title heading: 130
    Romanized title: 241
    Bibliographic title: 245
    Added: 730, 740
    Subject: 630
    Series (traced as in note): 440
    Series added entry: 840

SUBJECT: Other than those included in name and title
indexes

    Topic: 650
    Geographic name: 651
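
The assignment of entry elements to indexes in Figure 5.1 amounts to a
lookup from MARC tag to index. A minimal sketch follows; the tag sets
are taken from the figure, while the constant and function names are
assumptions introduced here for illustration.

```python
from typing import Optional

# Tag sets as listed in Figure 5.1.
NAME_TAGS = {
    "100", "110", "111",   # main
    "700", "710", "711",   # added
    "600", "610", "611",   # subject
    "400", "410", "411",   # series (traced as in note)
    "800", "810", "811",   # series added entry
}
TITLE_TAGS = {"130", "241", "245", "730", "740", "630", "440", "840"}
SUBJECT_TAGS = {"650", "651"}


def index_for_tag(tag: str) -> Optional[str]:
    """Return the index ("name", "title", or "subject") that an entry
    element with this MARC tag would feed, or None if it feeds no index."""
    if tag in NAME_TAGS:
        return "name"
    if tag in TITLE_TAGS:
        return "title"
    if tag in SUBJECT_TAGS:
        return "subject"
    return None


if __name__ == "__main__":
    for tag in ("100", "245", "650", "260"):
        print(tag, index_for_tag(tag))
```

Note that a name used as a subject (600, 610, 611) goes to the name
index and a title used as a subject (630) to the title index, so the
subject index holds only topical and geographic headings.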
Figure 5.2 - Examples of register and index entries

Register Entry

12345*

Ackoff, Russell Lincoln, 1919-
    Fundamentals of operations research [by] Russell
  L. Ackoff [and] Maurice W. Sasieni. New York,
  Wiley [1968]
    ix, 455 p. illus. 24 cm.
    Includes bibliographies.

    1. Operations research. I. Sasieni, Maurice W., joint
  author. II. Title.

T57.6.A2              001.4'24              67-27271
Library of Congress                         MARC

Index Entries

Name       Ackoff, Russell Lincoln, 1919- Fundamentals
           of operations research. 1968.
           ENG T57.6.A2 67-27271 12345
           DLC IU CSt RP NjP

           Sasieni, Maurice W., joint author. Fundamentals
           of operations research.
           [Ackoff, Russell Lincoln, 1919- ] 1968.
           ENG T57.6.A2 67-27271 12345

Title      Fundamentals of operations research.
           [Ackoff, Russell Lincoln, 1919- ] 1968.
           ENG T57.6.A2 67-27271 12345

Subject    OPERATIONS RESEARCH
           Ackoff, Russell Lincoln, 1919- Fundamentals
           of operations research. 1968.
           ENG T57.6.A2 67-27271 12345

*Note. The hypothetical register number in this example
is not intended to suggest the actual format of such a
number.

Table 5.1 - Order of data elements in each type of index entry ¹

                           Variable elements
Type of index entry   1st          2d                3d                Invariable elements

Main name ²           Main name    Title ³                             Imprint date, Language,⁴
                                                                       LC call number,⁵
                                                                       LC card number,⁵
                                                                       Register number
Added name            Added name   Title ⁶           [Main entry] ⁷    As above
Title ⁸               Title        [Main entry] ⁷                      As above
Subject ⁹             Subject      Main entry ⁷      Title ⁶           As above

¹ This relates to the function of the entry, not the index in which it
appears.

² A personal, corporate, or conference name used as a main entry.

³ Uniform filing title, romanized title, or bibliographic title, in that
order of preference.

⁴ MARC language code.

⁵ If present in register record.

⁶ Bibliographic title.

⁷ Not relevant for works entered under title.

⁸ Data elements in title index entries vary considerably depending on the
kind of title (uniform title heading, title main entry, title added entry,
etc.). In the interest of simplicity, this line of the table describes the
predominant case, entry under the bibliographic title of a work.

⁹ Name, title, topic, or geographic name as subject.

Since the National Union Catalog covers the en- 
tire range of current acquisitions cataloged by the 
Library of Congress and the contributing libraries, 
most of the entries it contains are not in machine- 
readable form. The balance will gradually shift as 
funding and other resources permit the expansion 
of the MARC Distribution Service, and it is possi- 
ble also that eventually some of the larger libraries 
may be able to report their new holdings directly
in machine-readable form. Nevertheless, for the 
foreseeable future, a means will be needed to com- 
bine machine-readable records and conventionally 
printed records to produce the National Union 
Catalog. 

The register volume would be made up of three 
types of records, each requiring different treat- 
ment : 

1) A MARC record for a full LC bibliographic
record converted to machine-readable form as part
of the MARC Distribution Service.

2) An NUC report from a contributing library.
After such a report has been certified to be for a
new title and the major access points have been
reconciled with the LC Official Catalog, the record
would be keyed in full, processed by the format
recognition programs, proofed, and verified. The
keying effort is essentially the same as that
required to prepare NUC copy in the present manual
system.

3) LC printed cards for records outside the scope
of the MARC Distribution Service.



MARC and NUC reports would be used to create
part of the register by computer-controlled
photocomposition techniques. LC non-MARC records
would be assembled for the second part of the
register using the same technique now employed
for the manually produced book catalogs.

The assignment of register numbers presents
certain difficulties. Machine-readable LC and NUC
records could be numbered by the computer as
they entered the system, but conventionally
printed LC records would have to be numbered
by hand. This would require a separate block of
numbers for each part of the register and, to avoid
confusion, the register numbers for the conventionally
printed LC cards should begin with a
distinctive prefix.
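
The two-block numbering scheme can be sketched as a pair of independent
counters. This is an illustration only: the class name and the prefix
"M" are assumptions introduced here, and the study does not specify the
form of the numbers or of the prefix. In practice the second counter
would be advanced by hand rather than by program.

```python
import itertools


class RegisterNumberer:
    """Hands out register numbers from two separate blocks: one for
    machine-readable records, one for conventionally printed LC cards."""

    def __init__(self, manual_prefix: str = "M") -> None:
        # The prefix value is hypothetical; any distinctive prefix would do.
        self.manual_prefix = manual_prefix
        self._machine = itertools.count(1)
        self._manual = itertools.count(1)

    def next_machine_number(self) -> str:
        """Number for a MARC record or a keyed NUC report."""
        return str(next(self._machine))

    def next_manual_number(self) -> str:
        """Number for a conventionally printed LC card."""
        return f"{self.manual_prefix}{next(self._manual)}"


if __name__ == "__main__":
    numberer = RegisterNumberer()
    print(numberer.next_machine_number())
    print(numberer.next_manual_number())
    print(numberer.next_machine_number())
```

Because the prefix keeps the two blocks distinct, the same integer can
be issued in both parts of the register without ambiguity.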

The indexes would be created as follows:

1) LC MARC records and NUC reports converted
to machine-readable form would be processed
automatically to produce truncated records for the
desired indexes.

2) LC non-MARC records would be represented
by a master index record in machine-readable
form, which would contain all of the data elements
necessary for automatic generation of the appropriate
index entries for each register entry.
Initially, the master index record would have to be
specially produced from a printed card, but it
should be possible eventually to fill this need by
adapting the machine-readable record used in the
projected automated LC Process Information File
(PIF). In the latter case, the only additional effort
would be inputting data elements that were
not required for PIF (e.g., subject entries). PIF
records in nonroman alphabets would probably
require special handling because the skeleton PIF
record might not provide all of the data elements
for the master index record.

Table 5.2 shows the type of inputs and outputs
of the proposed NUC system as well as the
machine-readable data bases that would be maintained.

Table 5.2 - Components of a national union catalog in machine-readable form

                            Type of output                         Retained machine data bases
Type of input               Register           Indexes             Register ¹            Indexes ²

MARC ³                      Machine-readable   Machine-readable    Register number       Yes
                                                                   and locations only
NUC reports                 Machine-readable   Machine-readable    Yes                   Yes
LC non-MARC full record     Manual             Not available       Not available         Not available
Master index record         Not available      Machine-readable    Yes                   Yes

¹ One file arranged by register number.

² One file of each type of index (name, title, subject).

³ MARC data base is retained elsewhere for other purposes.

An NUC System 

A system has been hypothesized to indicate how
the NUC register/index might be produced from
machine-readable records. A variety of solutions
could be postulated, taking into account different
computers and peripheral devices. The time when
such a system would be implemented, the expansion
of MARC, and the state of the art of networks
and regional machine-readable union catalogs
would influence the design. The NUC system could
be a subsystem of the LC system. If on-line access
to the national node from regional nodes is in
being, some or all files would be stored on random
access devices. On the other hand, the NUC system
could be a stand-alone system, utilizing
MARC records as a source of input but maintaining
its own files. If there were no requirement for
on-line access by regional systems, it would be
economical to design a batch processing tape system
because of the large volume of data involved
and the much higher costs of disk storage.

An NUC stand-alone system would receive
MARC data through the MARC Distribution
Service in the same manner as the LC Card Division
does today. The system also would need the
capability to maintain files of LC non-MARC
records, NUC reports not in the LC data base, and
locations. Since it is not part of this study to determine
an exact method based on a detailed analysis
resulting in definitive design with associated cost
estimates, the following methodology should be
considered as a possible way to assemble data for
the proposed publication.

Since various cumulation patterns for publica- 
tion of the indexes and the register of additional 
locations are possible, a hypothetical publication 
schedule has been assumed for the purpose of this 
description. As far as time intervals are concerned, 
the system is open-ended and schedules could be 
modified without any changes made to the system
described. 

The assumed publication pattern is as follows:

1) Monthly indexes at the end of each of the first
two months of a quarter.

2) A quarterly index cumulation (covering the
last three months) at the end of each of the first
three quarters of a year.

3) An annual index cumulation at the end of each
of the first four years.

4) A quinquennial index cumulation at the end of
the fifth year.

5) An annual list of locations not included in the
name index. The publication of location information
is as follows:

a) During the first year whenever the main
index entry for any given record appears (in
monthly, quarterly, and annual cumulations),
all locations received to date will be printed
with this entry.

b) At the end of the second year, a list of locations
will appear with all additional locations
for records printed in the first year.

c) All locations for this record received after
the second year will be cumulated and appear
in the quinquennial index main entry.

d) Additional location reports for older records
not in a quinquennial index will be published
in a separate list with the quinquennial.
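
The index portion of the assumed pattern (items 1 through 4) can be
expressed as a function of the month within a five-year cycle. This is
a minimal sketch; the function name and the convention of numbering the
months of a cycle from 1 to 60 are assumptions introduced here, and the
location lists of item 5 are not modeled.

```python
from collections import Counter


def index_issue(month_in_cycle: int) -> str:
    """Return the kind of index issued at the end of the given month,
    where months are numbered 1 through 60 within a five-year cycle."""
    if not 1 <= month_in_cycle <= 60:
        raise ValueError("month_in_cycle must be between 1 and 60")
    if month_in_cycle == 60:
        return "quinquennial"      # end of the fifth year
    if month_in_cycle % 12 == 0:
        return "annual"            # end of each of the first four years
    if month_in_cycle % 3 == 0:
        return "quarterly"         # end of each of the first three quarters
    return "monthly"               # first two months of a quarter


if __name__ == "__main__":
    # Tally how many issues of each kind one five-year cycle would produce.
    print(Counter(index_issue(m) for m in range(1, 61)))
```

Changing the schedule would mean changing only the intervals tested in
this function, which is consistent with the statement that the system
is open-ended as far as time intervals are concerned.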

The LC MARC records and NUC reports would be
used to generate that segment of the register produced
automatically. At the same time, a unit card
is produced to be filed for searching reports from
outside libraries against all NUC entries to identify
new entries and to post locations. The LC non-MARC
records are used to produce the manual segment
of the register and a copy of the record with
the assigned register number is sent to an input
section for keying the master index record. For
the remainder of this section, the machine-readable
record derived from the LC non-MARC record will
be referred to as the non-MARC record. It should be
kept in mind that this record is not a complete
bibliographic record. LC printed cards representing
these non-MARC records are also filed for
searching purposes. Since a large proportion of
the NUC reports are for the retrospective entries,
this search file would be maintained even if all
current entries were searchable in a machine mode.

The MARC records are the machine-readable
Library of Congress bibliographic files. Since LC
is responsible for the MARC Distribution Service,
these records are organized in such a manner
that the date of last transaction is readily available
for the purpose of distributing new, corrected,
and deleted records (by status of record code)
during a prescribed period of time. This same type
of control on date and status is required for the
publication of the register/index. Therefore, the
NUC reports and the non-MARC files would have
to be organized to allow this capability.

The register is published monthly and register 
entries are never reprinted. Between publications, 
bibliographic records are corrected and/or deleted 
when necessary and new bibliographic records are 
added to the file. In addition, an NUC report can
be replaced by an LC record (MARC and non-MARC).
Since a library can report a holding to a
published record at any point in time, and the
library may not be aware that an LC record has
replaced the reporting library record, a reference
is made from the number of the NUC report to the
LC record in the machine-readable file or in the
manual file, if one is maintained.

What is being produced is an updated version
of each machine file composed of the following:

1) All records which have required no updating. 

2) The updated form of records to which addi- 
tions or changes have been made. 

3) Records on the file which have been flagged as
deleted. 

4) All new records input since the publication of 
the last register/index. 

Any updated record becomes a new entry with 
a new register number and this record is published 
in the next print cycle of the register (including 
LC records which have replaced NUC reports).
When the next cumulative index listing is published,
the new register number is associated with
the index entries for the original record. There- 
fore, there is no longer any index entry pointing 
to the supplanted register entry. When a biblio- 
graphic record is deleted from the machine-read- 
able data base without being replaced by another 
record, the index entries are deleted from the next 
cumulative index listing. 

Both the MARC and non-MARC records have LC
card numbers, and NUC reports are assigned an
NUC number with similar characteristics. Therefore,
the LC or NUC card number³ is used as a two-way
link between the file of NUC locations and the
bibliographic files. When a bibliographic record is
entered in the machine file, a location record is
generated using the card number and the NUC symbols
for libraries holding that title. The location
file is organized in such a way that it is possible 
to date any action taken in relation to it. The dele- 
tion of a bibliographic record would automatically 
cause the deletion of the associated location record 
or its transfer to the card number of a substituted 
record. 
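
The link between the bibliographic files and the location file can be pictured with a minimal sketch. The class and method names below (LocationRecord, LocationFile, post, on_bibliographic_delete) are hypothetical and do not come from the report; the sketch only assumes that the card number is the key and that every action is dated.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, Optional, Set


@dataclass
class LocationRecord:
    card_number: str                                   # LC or NUC card number (link key)
    symbols: Set[str] = field(default_factory=set)     # NUC symbols of holding libraries
    last_action: Optional[date] = None                 # date of the latest action


class LocationFile:
    """Hypothetical in-memory stand-in for the location file."""

    def __init__(self) -> None:
        self.records: Dict[str, LocationRecord] = {}

    def post(self, card_number: str, symbol: str, when: date) -> None:
        """Post a holding; the record is created on first use and dated."""
        rec = self.records.setdefault(card_number, LocationRecord(card_number))
        rec.symbols.add(symbol)
        rec.last_action = when

    def on_bibliographic_delete(self, card_number: str, when: date,
                                replaced_by: Optional[str] = None) -> None:
        """Delete the location record, or transfer it to a substituted record."""
        rec = self.records.pop(card_number, None)
        if rec is None or replaced_by is None:
            return
        target = self.records.setdefault(replaced_by, LocationRecord(replaced_by))
        target.symbols |= rec.symbols
        target.last_action = when
```

Keying both files on the card number is what lets a deletion on the bibliographic side carry over to the location side without a separate lookup table.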

Depending upon the requirements for these records
beyond printing the NUC, the location file
may reside on disk or be maintained on tape. The
index records must be maintained for the production
of cumulative index publications and in this
form the main index entry contains location information.
A system could be designed to maintain
only those location records which will appear in
any future print cycle of the register of additional
locations, provided the only requirement of the
system is the publication of the register/index
and the register of additional locations.

Table 5.3. Input and output files in the NUC system

                          INPUT                                           OUTPUT

   Bibliographic file               Location file                  Interim file

1. New record added during          Original and added locations   Added¹ bibliographic record
   time period.                                                    with locations appended.

2. Updated record updated           Original and added locations   Updated bibliographic record
   during time period.                                             with locations appended.

3. New or updated record from       Locations deleted or added     Bibliographic record with
   previous time period not to      during time period.            updated locations appended.
   exceed one year.

4. Deleted bibliographic record                                    Deleted bibliographic record.
   during time period.

5. New or updated record from       Locations deleted or added     Bibliographic record with
   some previous time period        during time period.            updated locations appended.
   exceeding one year.

6.                                  Locations deleted or added     Updated locations.
                                    during time period.

¹ The underscore indicates the status given each record for updating the indexes.

Each month, new and updated MARC records and
NUC reports for the period are automatically
assigned a register number and output⁴ for publication
of the register. Non-MARC register entries
with preassigned register numbers are published 
in a manual mode at the same time. During this 
processing cycle, the following actions are per- 
formed and the resulting records written to an 
interim file : 

1) Each new or updated bibliographic record is 
passed against the location file and the location 
record appended.

2) For any bibliographic record containing a delete 
status code, the associated location record will 
have already been deleted from the location file 
and therefore this bibliographic record enters the 
system without an appended location record. 

3) Likewise, each location record residing on the
location file within the time span for the publication
of the indexes (i.e., added locations to a
bibliographic record previously printed in an index)
causes the selection of the associated bibliographic
record for reprinting in the next issue of
the index. Those location records that have no
corresponding bibliographic records in the machine
file (because they refer to records in the
manual NUC) are selected for eventual publication
in the list of additional locations. A location record
for a bibliographical record for a prior year is
also selected for inclusion in the list of additional
locations. In these cases, however, the locations are
appended to the associated bibliographic record
for later inclusion in the quinquennial index.
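
The three actions above can be sketched as one routine that builds the interim file for a single monthly cycle. This is an illustration under stated assumptions, not the report's design: the function name, the argument layout, and the routing labels ("reprint", "additional", "additional+quinquennial") are all hypothetical.

```python
def build_interim_file(period_bib, locations, changed_locs, current_year, prior_years):
    """Return interim records for one monthly cycle (hypothetical layout).

    period_bib   : list of (card_number, status), status in {"new", "updated", "deleted"}
    locations    : dict card_number -> list of NUC symbols (the location file)
    changed_locs : set of card numbers whose locations changed this period
    current_year : card numbers of machine records still within the index time span
    prior_years  : card numbers of machine records from a prior year
    """
    interim = []
    handled = set()
    for card, status in period_bib:
        handled.add(card)
        if status == "deleted":
            # Action 2: the location record is already gone, so nothing is appended.
            interim.append({"card": card, "status": "deleted", "locations": []})
        else:
            # Action 1: pass the record against the location file and append locations.
            interim.append({"card": card, "status": status,
                            "locations": list(locations.get(card, []))})
    # Action 3: location changes with no bibliographic transaction this period.
    for card in sorted(changed_locs - handled):
        locs = list(locations.get(card, []))
        if card in current_year:
            route = "reprint"                    # reprinted in the next index issue
        elif card in prior_years:
            route = "additional+quinquennial"    # listed now, indexed at the quinquennial
        else:
            route = "additional"                 # record exists only in the manual NUC
        interim.append({"card": card, "status": route, "locations": locs})
    return interim
```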

Table 5.3 shows the status of the records con- 
cerned with the bibliographic files and the location 
file at the time the register is produced and as the 
record enters the indexing subsystem. 

The bibliographic records are used to generate 
the name and title indexes and, in the case of LC 
records, the subject index. Locations are appended 
to the applicable index entry; that is, to main 
entries in the name index and to title main entries 
in the title index. Records with locations only are 
carried along for later inclusion in the list of additional
locations. Each index entry will be written
onto its own output tape. Each tape will then be
sorted on the key⁵ appropriate to it, i.e., names,
titles, and subjects. 

Since the cumulative index files are maintained
in sort-key order, each updated record for a data
element that is used as the major filing element
must have both the incorrect version and the updated
version. The incorrect version is used to find
the record on the file that will be replaced by the
updated version of the record. Therefore, to correct
a record, the system generates from each updated
bibliographic record a delete and add record
combination (two records); to delete a record, the
system generates a delete record only; and to add
a new record, the system generates an add record
only.

When a data element that is not used as a filing 
element is to be corrected, only a replace record 
need be generated. In this context, a replace record
is a single record that causes a previous record to 
be deleted. However, it may prove simpler for con- 
sistency of software to treat all corrections the 
same. Therefore, in the case where only the loca- 
tions have been affected, the system also generates 
a delete-and-add combination. 
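
A minimal sketch of the transaction rule just described, assuming index entries are represented as dictionaries; the function name and the action labels are hypothetical. Every correction, including one that touches only the locations, is emitted as a delete-and-add pair.

```python
def index_transactions(action, old_entry=None, new_entry=None):
    """Return the index transactions generated for one bibliographic action."""
    if action == "add":
        return [("add", new_entry)]
    if action == "delete":
        return [("delete", old_entry)]
    if action == "correct":
        # The old version locates the entry to be removed from the sorted file;
        # the new version is filed under its own (possibly different) sort key.
        return [("delete", old_entry), ("add", new_entry)]
    raise ValueError(f"unknown action: {action!r}")
```

Treating all corrections alike costs one extra record per correction but keeps a single code path in the merge step.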

Maintaining the index files in sorted order
should significantly reduce computer processing
time since the new index file requiring sorting is
relatively small, and the merge operation, a far 
simpler and less time-consuming procedure, is exe- 
cuted by incorporating the smaller file into the 
larger cumulative file. 
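
The saving comes from sorting only the small file and then merging. A minimal sketch using Python's standard heapq.merge follows; the function name and the use of in-memory lists in place of tape files are assumptions made for illustration.

```python
import heapq


def merge_into_cumulative(cumulative, new_entries, key):
    """Merge a small unsorted batch into an already sorted cumulative file.

    Only the batch is sorted; the cumulative file is read once, in order.
    """
    batch = sorted(new_entries, key=key)
    return list(heapq.merge(cumulative, batch, key=key))
```

For example, merging the batch ["jones", "clark"] into the sorted file ["adams", "brown", "smith"] with key=str.lower sorts two entries and then makes a single pass over both inputs.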

There are four updates in the machine system : 
monthly, quarterly, yearly, and the final (quin- 
quennial). Each update produces index files and, 
where applicable, the list of additional locations,
and accumulates the records to be passed along for 
the next higher accumulation period (except for 
the final which only produces the quinquennial in- 
dexes). The update pattern is shown in the follow- 
ing schematic diagram : 

M = monthly index file
Q = quarterly index file
Y = yearly index file

Numbers = months, quarter, year concerned; zero is
    used for location reports for records prior to M1;
    that is, those in the manual NUC catalog.

n = new bibliographic record

u = updated bibliographic record

d = deleted bibliographic record

L = location record (L is used both for posting new
    locations or deleting locations from bibliographic
    records in the machine as well as the
    manual system. In reality, location records
    referring to bibliographic records in the manual
    system would have to be indicated as such.)

[ ] = records carried in the system for the appropriate
    cumulation period and not printed in the
    month, quarter or year where they are enclosed
    in brackets.

The numbers always indicate the month in which
the original new record was entered into the system.
Thus, "u1-23" means all records from those
months that were updated in the current month
(month 24). Once an update or delete action has
been taken, these transaction records no longer are
retained in the system.



FIGURE. Schematic representation of machine files
required for the NUC system

M1       = n1 + [L0]
M2       = n2 + u1 + [d1] + [L0] + [L1]
M3 (Q1)  = n1-3 + u1-2 + d1-2 + [L0] + L1-2
M4       = n4 + u1-3 + [d1-3] + [L0] + [L1-3]
M5       = n5 + u1-4 + [d1-3] + [d4] + [L0] + [L1-3] + [L4]
M6 (Q2)  = n4-6 + u1-5 + [d1-3] + d4-5 + [L0] + [L1-3] + L4-5
M7       = n7 + u1-6 + [d1-6] + [L0] + [L1-6]
M8       = n8 + u1-7 + [d1-6] + [d7] + [L0] + [L1-6] + [L7]
M9 (Q3)  = n7-9 + u1-8 + [d1-6] + d7-8 + [L0] + [L1-6] + L7-8
M10      = n10 + u1-9 + [d1-9] + [L0] + [L1-9]
M11      = n11 + u1-10 + [d1-9] + [d10] + [L0] + [L1-9] + [L10]
M12 (Y1) = n1-12 + u1-11 + d1-11 + L0¹ + L1-11

M24 (Y2) = n13-24 + u1-23 + [d1-12] + d13-23 + [L0] + L1-12² + L13-23

M36 (Y3) = n25-36 + u1-35 + [d1-24] + d25-35 + [L0] + [L1-12] + L13-24³ + L25-35

M59      = n59 + u1-58 + [d1-57] + [d58] + [L0] + [L1-57] + [L58]
M60 (Y5) = n1-60 + u1-60 + d1-60 + L0 + L1-60

¹ List of additional locations for manual NUC catalog.
² List of additional locations for year 1.
³ List of additional locations for year 2.
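
The update pattern in the schematic follows a simple calendar rule, sketched below. The function name and the returned labels are hypothetical; the rule itself (every third month is a quarterly, every twelfth a yearly, month 60 the quinquennial) is read off the M3 (Q1), M12 (Y1), and M60 (Y5) rows.

```python
def cumulation_level(month):
    """Return the highest cumulation produced in a given month (1 to 60)."""
    if not 1 <= month <= 60:
        raise ValueError("month must lie within the five-year cycle (1-60)")
    if month == 60:
        return "quinquennial"
    if month % 12 == 0:
        return "yearly"
    if month % 3 == 0:
        return "quarterly"
    return "monthly"
```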

All files are sorted prior to publication and/or
merged into the next higher level accumulation
in the following descending sort hierarchy:

Sort key
LC or NUC card number
Date
Delete flag
Add flag

This ordering brings together in date order entry
deletions and additions for a given unique index
entry (sort key, card number) for the merging
process.
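
The sort hierarchy can be expressed as a composite key. In the sketch below the field names are hypothetical, dates are assumed to be ISO-style strings that compare correctly as text, and it is assumed that a delete sorts ahead of an add bearing the same date, so that a correction removes the old entry before filing the new one.

```python
from itertools import groupby


def transaction_order(t):
    """Composite key: sort key, card number, date, then delete before add."""
    return (t["sort_key"], t["card_number"], t["date"],
            0 if t["action"] == "delete" else 1)


def surviving_entries(transactions):
    """Keep, for each unique index entry, the result of its latest transaction."""
    ordered = sorted(transactions, key=transaction_order)
    survivors = []
    for _, group in groupby(ordered, key=lambda t: (t["sort_key"], t["card_number"])):
        last = list(group)[-1]
        if last["action"] == "add":   # an entry whose last action is a delete is dropped
            survivors.append(last)
    return survivors
```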

Prior to the actual publication of the indexes, it 
would be necessary to pass the files against com- 
puter-based authority files to add the reference 
information to the name and subject indexes. Since 
a record which represents a new title but repeats 
a name or subject used in the last published edition 
can be added to the files, it would also be necessary
to remove duplicated references from the files
and to insure against the inclusion of any blind
references in the index prior to publication.

Using the NUC figures for the years 1966-1970,
the number of index records and cross references
that would be generated and their record lengths
were estimated from available statistics at LC as
well as LC MARC statistics produced by Columbia
University. Assuming magnetic tape with a density
of 1600 cpi and a blocking factor of 20, the
quinquennial indexes would require a total of ap- 
proximately 34 tapes (17 for name index, 9 for title 
index, and 8 for subject index). 
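
The form of such an estimate can be sketched as follows. Only the density (1600 cpi) and the blocking factor (20) come from the text; the reel length of 2400 feet, the 0.6-inch gap between blocks, and any record counts or record lengths passed in are assumptions for illustration, so the sketch is not expected to reproduce the figure of 34 tapes.

```python
import math


def tapes_needed(n_records, record_len_chars, density_cpi=1600,
                 blocking_factor=20, gap_inches=0.6, reel_feet=2400):
    """Estimate the number of reels needed for fixed-length blocked records."""
    # Tape consumed by one block: its data plus the gap that follows it.
    block_inches = blocking_factor * record_len_chars / density_cpi + gap_inches
    blocks_per_reel = math.floor(reel_feet * 12 / block_inches)
    n_blocks = math.ceil(n_records / blocking_factor)
    return math.ceil(n_blocks / blocks_per_reel)
```

Raising the blocking factor reduces the share of tape lost to gaps, which is why the factor is stated alongside the density.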

Cost Factors 

The maintenance of the National Union Catalog 
and the publication of its holdings entail many 
functions. In the present system all require man- 
power ; in the projected system many of these func- 
tions would be performed all or in part by com- 
puter. Although estimating actual costs for an 
automated NUC is beyond the scope of this study,
it seems worth considering how these cost factors 
would be affected by a change from one mode of 
operation to the other. 

Editorial Cost Factors. 

The effect of automation of the NUC on each of
the major editorial functions is discussed in the 
following paragraphs. For details about the spe- 
cific duties in each of these functions, see Appendix 
C. 

1) Arranging and Sorting. Hundreds of thousands
of LC cards and NUC reports are assimilated
into the NUC data base each year. At the outset,
they are received, recorded, and sorted by hand for
further processing. In view of the volume of activity
and the high proportion of overlapping reports,
there is no reasonable expectation that it
would be economical to automate this function.
Therefore, the cost of this function would remain
essentially the same in the proposed automated
system.

2) Searching. The searching function consti- 
tutes one of the major cost factors in maintaining 
the NUC, but it does not appear to be one susceptible 
to amelioration by automation. The preponderance 
of NUC reports are duplicates of entries already 
in the system and thus they become merely reports 
of added locations. Although many NUC reports
of added locations are submitted in the form of
LC cards, at least as many are not so readily identifiable.
Given the variation in critical data elements
in many reports, it is difficult to conceive of an
effective machine searching technique that would
not involve excessive keying to secure an exact
match frequently enough to make the process
worthwhile. Moreover, even if machine searching
were practical, it would be many years before the
machine data base was large enough to satisfy a 
reasonable proportion of the searches. Therefore, 
it cannot be expected that the automation of the 
NUC would have any effect on the cost of this 
function. 

3) Editing. Most of the tasks comprised by the
editorial function will be unaffected by the proposed
automated system. The editing of NUC reports
for new titles constitutes the major workload
and there is no likelihood that this task can
be lightened by the computer until name authority
information is available in machine-readable form.
However, some cost reduction will be possible because
certain editorial work in providing added
entries and references will be superseded by the
automatic generation of these entries from
machine-readable bibliographic and reference control
records.

4) Keying. Typing NUC reports for conversion to
machine-readable form would involve approximately
the same effort as typing these records in
the manual system. Typing the master index record
for the LC non-MARC records would entail
a workload that is not required for the main entry
in the manual system. On the other hand, typing
added entries and references would be unnecessary
for all types of records since these access points
would be generated by the computer. Thus, the
overall cost of typing should be somewhat lower 
in the automated system. 

5) Proofing. Proofing of added entries and ref- 
erences would no longer be necessary in an auto- 
mated system because these data would have 
already been proofed as part of the creation of the 
original record from which they were generated. 
In the case of LC non-MARC records, it is assumed
that the added difficulty of proofing the
master index record would be offset by the fact
that proofing separate added entry records would
be unnecessary. The proofing of the NUC register
record would become somewhat more difficult because
it would involve also verifying the accuracy
of the format recognition processing. Across the
board, however, some reduction in the cost of 
proofing may be anticipated. 



6) Filing. Once the records are in the MARC format,
it is no longer necessary to arrange entries
by hand for the book catalog indexes and even 
filing the manual control file would be facilitated 
by the machine sorting of new entries prior to 
actual filing. Thus, a substantial reduction in the 
cost of this function could be expected. 

7) Mounting and Stripping. The present manual 
method of mounting printed cards for reproduc- 
tion and then stripping them for later cumulation 
would be unnecessary for any entry in machine- 
readable form. Such an entry would be processed 
by a computer-driven photocomposition device 
and cumulations would be produced by machine 
without regard to the printed form of earlier is- 
suances. Even in the case of LC non-MARC records 
there would be some saving because there would be 
no need to strip the cards for cumulation. 

8) Supervision. Since the cost of supervision tends
to be a relatively stable percentage of the aggre- 
gate cost of other functions, it may be assumed 
that a reduction in those costs would produce a 
corresponding decrease in the cost of supervision. 
It is not possible to estimate, however, whether 
this reduction would be significant because the 
complexities of the automated system might make 
greater demands for supervisory time. 

Noneditorial Costs.

The primary noneditorial costs are those involved
in printing and binding the issues of the
catalog. They are influenced by such factors as:
1) the number of times an entry must be reprinted
in the course of various cumulations; 2) the
amount of information included in each entry and
the resulting number of entries that can be fitted
on a page; and 3) the form of the hard copy.

Under present practice, the full entry may be
printed as many as four times: in a monthly issue,
a quarterly, an annual, and a quinquennial. However,
because entries with imprints falling outside
of the current three-year period do not appear in
monthly issues and because the fourth quarterly
is not published at the end of the year, the average
entry is printed 3.24 times. The register/index
catalog has the advantage of requiring that the
full entry be printed only once.

Since the indexes are cumulated, index entries
do appear more than once in the life cycle of the 
publication. However, the index information con- 
tains only those elements required to facilitate the 



use of the index entry as a stand-alone entry in 
addition to providing a link to the full entry in 
the register. Due to this reduction in the content 
of the entry, many more entries can be printed on 
a page. The number of entries in the present three-column
format of the NUC is 27 per page; the estimated
number of entries per page in a three-column
format for the name index is 81.

The cost of publishing the register, the various 
indexes and the register of additional locations is 
also dependent on the form of output selected. As
all information (with the exception of the full
non-MARC entries) would be in machine-readable form,
several main alternatives are available : 

1) Graphic arts quality through a photocomposi- 
tion device. 

2) Reduced quality through Computer Output 
Microfilm (COM) to lithoplate.

3) Microform.

For each option the cost of publication is further 
dependent on the cumulation schedule chosen. In 
the present manual system, Books: Authors is published
monthly, quarterly, and annually; Books:
Subjects is currently published only quarterly and 
annually. In the proposed system it has been as- 
sumed that the name, title, and subject indexes 
would be published monthly, quarterly, and 
annually. 

In the present manual system, noneditorial costs 
(i.e., printing, binding, shipping, etc.) account for 
nearly half of the total NUC cost. While it is beyond
the scope of this study to identify and evaluate
the many combinations of forms that are
possible, it is evident that significant savings could 
be effected by using microform for the indexes. A 
conservative approach to this means of cost reduc- 
tion would be to issue monthly indexes in micro- 
fiche and the quarterly, annual, and quinquennial 
cumulations in conventional print form. 

The anticipated effect of automation of the Na- 
tional Union Catalog on the costs of various func- 
tions is summarized in Table 5.4. The significance 
of the variations is difficult to assess because the
various functions do not contribute equally to the 
total cost and the summary of editorial costs does 
not take account of the cost of record control pro- 
cedures that typify complex machine input opera- 
tions. Therefore, although the overall cost of the 
proposed automated system may be less than that
of the manual system, it would be more prudent to
say that it should not exceed that cost.

Table 5.4. Increase or decrease in the cost of producing
the National Union Catalog by computer relative to the
present manual method, by function and type of record¹

                        LC MARC           LC non-MARC         NUC report
Function              Register  Indexes²  Register  Indexes   Register  Indexes

Editorial:
  Arranging, sorting
  Searching                     NA³       NA        NA
  Editing             NA                  NA
  Keying              NA                  NA
  Proofing            NA                  NA
  Filing
  Mounting, stripping
  Supervision
Noneditorial:
  Printing
  Binding                                           +
  Shipping

¹ Relative cost shown by following symbols: + (greater than manual cost),
= (same as manual cost), - (less than manual cost).

² Added entries in the manual system are analogous to index entries in the
machine system.

³ Not applicable; used when a function is not necessary in either system.

A Model NUC Network 

An NUC reporting system could be organized
on the basis of regional bibliographical centers
that played an intermediary role by coordinating
reports from their areas and by helping area users
to obtain desired material. At a highly developed
stage, such centers could be responsible for
1) scrutiny, verification, and possible alteration
of incoming records as an initial step in their
integration into the NUC file, and 2) referral of
requesters to material or inter-library borrowing
and lending operations in response to either search
requests from within each region or queries
transmitted from among those received at the national
center. Each of these functions will be considered
briefly.

Most of the tasks connected with integration of
data into the NUC store could be performed at
the regional level, subject to the completeness and
currency of authority records maintained at the
regional centers and the capability of the centers
to manipulate data in machine-readable form.
However, the final reconciliation of incoming
records with the LC Official Catalog would continue
to be a task of the national agency unless it
were possible to have the entire LC Official Catalog
in on-line mode at locations throughout the
country. The following functions might be assumed
locally:

1) Development of subject and/or form responsibilities
for specific libraries within each region to
channel reports of locations for particular items.

2) Coordination and periodic transmission to the
national center of location reports for items already
known to be cataloged by LC and in the
MARC store.

3) Coordination and periodic transmission to the
national center of location reports for items already
known to be cataloged by LC but not designated
as being in the MARC data store. If records
for these items have already been encoded in
machine-readable form at the local level (or if the
capability to convert manual records exists at the
regional center), they might be converted to the LC
MARC format by the processes described in
Chapter 4.

4) Coordination and periodic transmission to the
national center of data and locations for items not
already known to be cataloged by LC. In such
cases it is likely that a division of functions between
the national and regional centers would be
desirable. At the least the regional centers would
coordinate the reporting of locations on the basis
of assigned responsibilities for coverage and indicate
to the national center whether or not cataloging
practices conformed to those of the Library of
Congress. Further action toward integration into
the NUC file might be possible as facilities and
available data at the regional centers expanded.

In a network of regional centers it is not clear
whether locations would best be reported in the
NUC outputs as those of specific institutions or
simply as items held within particular regions.
The former procedure would allow for the continuation
of direct referral of a search request to
a library holding the item, although the requester
would not necessarily be informed of other locations
within the region which might be more
advantageous for referral purposes in given instances.
Although reporting in terms of regional
centers would require the center to act as a
"middleman" in the handling of search requests, it
might allow for greater flexibility and rationality
in the flow of requests for items to be borrowed.
The choice of reporting schemes would depend on
what capabilities the regional centers developed
and what roles they were willing to assume.

Conclusions 

Automation of the National Union Catalog 
using the register/index form would have the 
following advantages : 

1) The range of access points to the bibliographic
data would be extended to titles and series. 

2) All types of indexes would be cumulated and 
published on the same schedule. 

3) The time required to produce cumulations would 
be significantly reduced. 

4) The cost of the automated system offering these
advantages for monthly, quarterly, and annual
issues would not exceed the cost of the present 
manual system. The cost of producing the quin- 
quennial would be sharply reduced. 

5) The cost of the automated system should grad- 
ually be reduced as more languages are covered by 
the MARC Distribution Service. Further cost reduc- 
tions may be possible as other libraries are able to 
report their holdings in machine-readable form. 



6) Converting NUC reports and master index records
for LC non-MARC records to machine-readable
form would create a data base that could be
searched by nonconventional access points (e.g., 
language, imprint date, geographic area). 

7) The NUC data base might eventually form the
nucleus of an on-line network of regional bibli- 
ographic centers. 

References and Notes 

1 This project is described in Avram, Henriette D.,
Lenore S. Maruyama, and John C. Rather. "Automation
activities in the Processing Department of the Library of
Congress." Library Resources and Technical Services, v.
16, Spring 1972, p. 195-239.

2 The date of last transaction for new records would be
the same as the date entered on file. For modified or deleted
records, the date of last transaction is the date that
any processing was performed affecting a particular record
and therefore the record must again be distributed to
subscribers.

3 The register number could be maintained as the link
to the location index but using the LC card number allows
direct access to locations when that number is known
from a source other than one of the indexes.

4 Output in this context means formatting the records
and writing them on magnetic tapes as input for a
photocomposition or COM device.

5 The sort keys will be computer generated by a program
designed to satisfy the filing requirements of the published
indexes.

Chapter 6



Alternative Strategies for RECON 



Introduction 

Experience in the RECON Pilot Project indicates
that it would be impractical to undertake the large- 
scale conversion project envisaged in the original 
RECON study. On this scale, such a project would 
demand far more staff, space, and money than 
there is any reasonable prospect of obtaining. A 
retrospective conversion project on a lesser scale 
has the evident disadvantage of being too slow in 
responding to the needs of individual libraries 
aiming toward automation involving total conver- 
sion. It appears to be a fact of life that many li- 
braries are disinclined to postpone local efforts 
until records are available from a central source. 
Therefore, the library community is still faced 
with costly conversion efforts resulting in multiple 
files of nonstandardized data as well as duplica- 
tion in titles converted. 

For these reasons, the RECON Working Task
Force felt the need to reexamine the premises of 
its original study to determine whether an alterna- 
tive strategy might offer a better prospect of satis- 
fying the need for retrospective conversion. The 
present chapter considers the merits of systematic 
versus nonsystematic conversion as well as alternative
ways in which the records might be made
available. 

In attempting to evaluate the advantages and 
disadvantages of various strategies, the Working 
Task Force was constantly faced with the realiza- 
tion that there is no perfect solution to the problem. 
The critical questions of the languages to be cov- 
ered, the dates of the records, the forms of ma- 
terial, the extent of the bibliographic information, 
and the details of the machine format yield widely 
different answers depending on the type and size 
of library involved. Therefore, the best that can 
be hoped for is a compromise on the requirements 
of libraries of various types and sizes. The ensuing 



discussion is an attempt to reach an optimum solu- 
tion to the problem. 

Systematic versus Nonsystematic Approach 

In the context of this discussion, systematic con- 
version means the orderly conversion of existing 
LC records by date and language. This allows a 
potential user to predict with reasonable certainty
whether a desired record is in the data base. 

The systematic approach to retrospective conversion
recommended by the RECON Working Task
Force has the advantage of offering a full MARC
record of the quality of the LC Official Catalog 
and a clear definition by date and language of 
records that are in machine-readable form. It is 
obvious, of course, that from the standpoint of any 
given user systematic conversion has the disadvan- 
tage of requiring a long waiting period before all 
relevant records are available. 

Nonsystematic conversion applies to conversion 
of subsets of existing records that are defined by 
less precise criteria ; for example, all records repre- 
sented in a bibliography. In such a case, a potential 
user can determine whether the record is available
in machine-readable form only by checking the 
bibliography in question or by querying the data 
base. The conversion of records from another li- 
brary's data base has this same disadvantage; 
namely, that there is no easy way to tell whether 
a specific record has been converted. 

Systematic conversion of retrospective records 
by year of card series and language can be shown 
to be inadequate even to meet the needs for cur- 
rent acquisitions. An analysis of LC card orders 
for a one-year period shows a remarkable demand
for older records. While it is true that 70 percent
of the total number of card orders were for titles
published in the last 11 years, the fact remains
that 52 percent of the titles ordered were older
than 11 years (see Appendix D). The analysis of
titles ordered once shows a striking consistency
in the demand for uncommon titles; the percentage
of single orders for titles in the latest series is
scarcely different from the corresponding percentage
in the oldest series. It may be assumed also that
a substantial proportion (perhaps even the majority)
of the titles ordered once this year will not
be ordered at all next year and that they will be
replaced by titles that were inactive this year. Thus 
it seems that, because of the pattern of current ac- 
quisition of retrospective materials in American 
libraries, a substantial body of retrospective rec- 
ords would have to be converted even to meet cur- 
rent demands for machine-readable records. 
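
The distinction drawn above between card orders and distinct titles can be
illustrated with a small sketch. The figures below are invented for
illustration and are not the Appendix D data; the names TITLES,
share_of_orders_recent, and share_of_titles_older are likewise hypothetical.
The sketch only shows how most orders can fall on recent titles while most
of the distinct titles ordered are older.

```python
# Hypothetical sample: title id -> (age of title in years, times ordered).
# These numbers are illustrative assumptions, not figures from Appendix D.
TITLES = {
    "A": (3, 10),
    "B": (5, 4),
    "C": (15, 1),
    "D": (20, 1),
    "E": (30, 1),
    "F": (40, 3),
}

CUTOFF_YEARS = 11  # "recent" means published within the last 11 years


def share_of_orders_recent(titles, cutoff=CUTOFF_YEARS):
    """Fraction of all card orders that were for titles no older than cutoff."""
    total = sum(orders for _, orders in titles.values())
    recent = sum(orders for age, orders in titles.values() if age <= cutoff)
    return recent / total


def share_of_titles_older(titles, cutoff=CUTOFF_YEARS):
    """Fraction of distinct titles ordered that are older than cutoff."""
    older = sum(1 for age, _ in titles.values() if age > cutoff)
    return older / len(titles)


if __name__ == "__main__":
    print(f"orders for recent titles: {share_of_orders_recent(TITLES):.0%}")
    print(f"titles older than cutoff: {share_of_titles_older(TITLES):.0%}")
```

A few heavily ordered recent titles dominate the order count, while the
many older titles ordered once or twice dominate the title count.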

An alternative approach to RECON would be
to undertake the conversion of titles ordered more
than a specified number of times (say, more than
3) on the assumption that a retrospective title
being acquired currently by that many libraries
is likely to be held by many other libraries. Even
with this approach, however, the number of
records to be converted would be very large (in
the specific case, approximately 425,000 records)
and the coverage of titles needed by any particular
library would necessarily be incomplete.
On the other hand, this approach has the advantage
of resolving the problem of selecting
records that would satisfy the largest number of
libraries of various types and sizes.
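
The selection rule described above (convert titles ordered more than a
specified number of times) might be sketched as follows. This is a minimal
illustration, not a procedure given in the report: the function name
select_for_conversion, the sample card numbers, and the representation of
the order log as a list with one card number per order are all assumptions.

```python
from collections import Counter


def select_for_conversion(card_orders, threshold=3):
    """Return card numbers ordered more than `threshold` times, sorted."""
    counts = Counter(card_orders)
    return sorted(card for card, n in counts.items() if n > threshold)


# Hypothetical order log: one LC card number per order received.
SAMPLE_ORDERS = (
    ["52-1001"] * 5 + ["48-2002"] * 4 + ["39-3003"] * 3 + ["61-4004"]
)

if __name__ == "__main__":
    for card in select_for_conversion(SAMPLE_ORDERS):
        print(card)
```

Raising or lowering the threshold trades the size of the conversion effort
against the coverage any one library can expect.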



Alternative Forms of Conversion 

Regardless of the data base chosen for con- 
version, it is necessary to settle the question of the 
form it will take. The RECON feasibility study
recommended conversion of the full bibliographic
record to machine-readable form. It would be
possible alternatively to create machine-readable
indexes to the data base and to store the full
records in microform. A variation of this possibility
would involve producing the index records
and relying on the printed National Union Catalog
as the source of the full records.

The cost of putting the full record in machine- 
readable form varies with the source of the data 
and the extent to which they are made consistent 
with the LC Official Catalog. The range is from
$2.85 for an LC record to $1.45 for an outside
library record for which only the major access 
points have been verified (see Chapter 4). 

The concept of the index entry in lieu of the full 
record entails a basic dilemma. The more data elements
included to make the index entry self-sufficient,
the more the cost of creating it tends to
approach the cost of a full record. On the other
hand, as data elements are eliminated in the interests
of economy, the index entry becomes progressively
less responsive to various bibliographic
needs. In the latter case, truncation of the record 
has the effect of severely limiting the library func- 
tions that can be completely automated by using 
the record. For some purposes, the need could be 
met by consulting the full record in another source 
(e.g., microform or book form) but the trade-off 
between economy of machine input and cost of 
human effort in use may be difficult to evaluate. 

It was such considerations as these that con- 
vinced the Working Task Force to recommend in 
its original study the conversion of the full bibliographic
record to the MARC format, and to confirm
that conclusion in its study of levels of machine-readable
records (see Chapter 3). The advantage of
having a full MARC record for national purposes is
that, regardless of the intended use, the required 
information is available. 

A factor to be considered in evaluating the mer- 
its of a system involving a machine index to a 
microstore of full bibliographic records is the cost 
of maintaining the microstore. Existing equipment 
for storing large numbers of microimages seems 
always to be expensive, especially when it must be 
capable of providing relatively rapid access to in- 
dividual microimages. Another disadvantage in 
any proposal to use this technique on the national 
level is the procedural complexity of implementing 
it. The problems of which file should be filmed, 
how it would be filmed, and how the index records 
could be efficiently created from the source data 
should not be underestimated. They are in fact the 
same problems that were discussed in connection 
with the microfilming of records in the RECON
Pilot Project.^ 

In the case of creating an abbreviated machine 
record and relying on the existing NUC book catalog
for the record, the present difficulty in locating
a particular entry, especially revised entries,
among the various alphabetic sequences of NUC
would remain. This disadvantage could be lessened
by including in the abbreviated record (at additional
cost) a number for the NUC volume containing
the full record. Experience in the RECON Pilot
Project suggests, however, that the difference in
cost between an index record and a full record
would not be sufficient to offset the difficulties (that
is, the costs to the user) of obtaining the full record
when it was needed.

Conclusions

In the light of the foregoing considerations, the
RECON Working Task Force feels that large-scale
retrospective conversion should be undertaken by 
a centralized agency (or component of an agency) 
established expressly for that purpose. This effort 
should not divert the Library of Congress from its 
present objective of going forward as rapidly as 
possible to convert all of its current catalog records 
to machine-readable form. To the extent that retro- 
spective records are required for Library of 
Congress purposes (e.g., Card Division mechaniza- 
tion; special book catalogs), LC would convert 
these records according to its present practices. 
The central agency should have two major 
functions: 

1. It should undertake a program to convert the 
retrospective LC records that are most in demand. 
Initially, the criterion for selection might be those 
records ordered from the LC Card Division more 
than a specified number of times.

2. It should be responsible for adapting machine-readable
records from libraries other than LC. The
scope of this cooperative approach would be modi- 
fied as each new language is covered at LC. 

In developing its program and carrying out 
these tasks, the agency should draw on the experience
gained in the MARC and RECON activities at the
Library of Congress. Since users will be obtaining 
current catalog records from the Library of Con- 
gress, it is essential that the products of these two 
enterprises be entirely compatible. 



To ensure that the conversion of other libraries' 
machine-readable data bases result in consistent 
records, the following procedures are recommended:

1. If a library converts, it should use the best
available LC record.

2. If at all possible, the full MARC format should
be used.

3. The centralized agency should undertake to
process records to bring them to the full MARC
format (if necessary) and to make the access 
points compatible with the LC Official Catalog 
(see Chapter 4). 

The question of how such an agency could be 
funded is beyond the scope of this study. Since 
the heavy expenditure involved would have to be
justified in national terms, it seems reasonable to 
suppose that the operating expenses of the agency 
might come from Federal sources. It is possible, 
however, that foundation funds could be obtained 
to underwrite the costs of planning the organiza- 
tion and supporting it during a test period. The 
investigation of these possibilities might be an 
appropriate task for the National Commission
for Libraries and Information Science.

Reference 

RECON Pilot Project. RECON Pilot Project; final report.
Prepared by Henriette D. Avram. Washington, Library of
Congress, 1972. p. 31-43.

Appendix A



Problems in Achieving a Cooperatively Produced Machine- 
Readable Bibliographic Data Base 



by Paul B. Kebabian*



In assessing the utility of a machine-readable 
bibliographic record, the feasibility study prepared
by the RECON Working Task Force in 1969
stated:

A prime reason for converting catalog records to machine- 
readable form is to achieve greater flexibility in manipulat- 
ing data. This flexibility will facilitate searching and 
retrieval; it will lessen the effort of updating records;
and it will contribute to production of a wide variety of
cataloging products (cards, book catalogs, special lists,
book labels, etc.). Although initially most of the applications
will be along traditional lines, computerization of
cataloging data should give an added dimension to bibliographic
control that may materially alter familiar patterns
of use.^

In the following remarks, these a priori assumptions
are made: 1) the development of a machine-readable
bibliographic data base, consisting of
retrospective library catalog records which can
be acquired, or to which access can be made by
many libraries or groups of libraries, is a desirable
objective; and 2) the reasons why such an
achievement would be of great value to library
service, as stated in the original RECON report,
are essentially valid. 

In essence, the problem of achieving a bibliographic
data base by cooperative means is related to the
nature of the record. By "nature" I refer to the
characteristics of the record in terms of its
constituent elements as defined and prescribed by
cataloging codes and standards of practice for the
order and content of the catalog entry, the subject
terminology, and classification. The systematic
application of codes of principles and practice,
authority lists, and standardized classification
schedules in preparing a bibliographic record is
desirable if not essential for maximum utility and
accessibility. This need obtains whether the end
product is a cooperatively produced machine-readable
catalog record, a traditional card form union catalog,
or the catalog of an individual institution.

*Mr. Kebabian is director of libraries at the University
of Vermont. He was formerly associate director of
libraries at the University of Florida and chief cataloger
at the New York Public Library.

In considering the scope of a project to convert
retrospective bibliographic records to machine-readable
form, the RECON Working Task Force report
proposed that first priority be assigned to
English language monographs from 1960 to 1969,
followed by Romance and German language monographs
from 1960 to 1969 and English language
monographs from 1898 to 1950.^ The question of
the records to be converted had another major
dimension, namely the source or sources from
which the records would be drawn.

Several existing card form, book form, and
machine data bases are obvious possibilities. They
include the National Union Catalog, existing regional
union catalogs, the catalogs of a selected group of
major research libraries, the Library of Congress
Official Catalog, and the computerized catalog records
of institutions or combinations of libraries that have
already converted files as part of their automation
applications. If large-scale retrospective conversion
is a desirable end, then the maximum desideratum would
seem to be the largest master file available. The
Library of Congress cataloged some 4.[?] million titles
in the period 1898-1969.^ The National Union Catalog
(NUC) consists of an estimated 11 million titles,
including the LC entries. The breadth of coverage of
the NUC is large in comparison with the holdings of
the major regional union catalogs. In 1942, it included
over 80 percent of their holdings, but none of the
regional union catalogs had more than 9.2 percent of
the NUC titles.^

The RECON report took into consideration a
variety of sources to serve as a possible base for
catalog record conversion and concluded that the
most satisfactory base would not be the NUC, but
rather the LC Official Catalog. For technical
reasons, however, conversion would begin with
cards from the LC Card Division "record set," a
file containing a master copy of the latest revised
reprint of each LC catalog card. The reasons for
this conclusion are discussed in the RECON report.^

Perhaps the most realistic and compelling rea- 
son for this choice was the recognition that, if the 
conversion project was to result in a useful product 
offering the potential for a variety of applications, 
the data base should be derived from the source 
offering the greatest consistency and standardization
in its bibliographic information. Although
there may be no positive evidence in the form of
studies of consistency of cataloging standards observed
by the Library of Congress over the years,
empirical evidence does exist in the LC card and
book catalog products. 

At the same time there is evidence that other 
libraries have observed varying local cataloging 
standards. This information is provided by studies
of changes made in main and added entries, in
subject headings, notes, and classification on LC
cards used in other libraries. A study by John
Dawson^ analyzed the kinds of changes made in
2,679 LC cards by nine major university libraries
using LC cataloging copy. It revealed that less
than half of the LC cards used were incorporated
in catalogs without change. Although main and
added entry changes were proportionately low,
libraries using the LC classification changed 15.55
percent of the numbers. On 15.45 percent of the 
cards, the LC subject headings were either altered, 
supplemented, or not used. 

Other evidence of a lack of consistency is pro- 
vided by a cursory examination of outside library 
entries in any part of the printed NUC. Johannes
Dewton, writing in 1961 about the draft of the
cataloging code then in process of development,
observed: "1. That under the present Cataloging
Code there is a considerable lack of uniformity of
cataloging . . . especially in the field of corporate
authorship [and] 2. That uniformity is desirable,
even needed, in order to exploit to the best advantage
the resources of American libraries . . . and
the possibility of machine control of information
makes this uniformity a focal point of interest." ^
Mr. Dewton was reflecting on a lack of standardization
chiefly in the area of main and added entries
as they affected the card form and published
National Union Catalogs. He provided 60 examples
to illustrate inconsistencies as submitted to
NUC by "significant research libraries."

An important effort in the cooperative preparation
of bibliographical data was the LC cooperative
cataloging program which, at its peak,
involved participation of over 150 American libraries.
It was initiated in late 1932 under sponsorship
of the American Library Association
Cooperative Cataloging Committee and the Library
of Congress with the subsequent assistance
of a grant from the General Education Board. In
the initial twelve-year period, 1932-1943, the Library
edited and printed catalog cards for some
96,000 titles submitted by cooperating libraries.^
Fourteen years later, Dawson stated that "cooperative
cards make up over one third of the LC
cards used by research libraries for foreign-language
titles." ^

Difficulties of handling cooperative copy for 
printing of cards were recognized and commented 
on early in the program. In 1934 Charles H. Hastings,
Chief of the Card Division, noted: "The item
of cooperation with outside organizations that has
given us most concern and has drawn most heavily
on the time and energy of the division has been
the revision and the proofreading of the entries
supplied by libraries that are cooperating under
the direction of the A.L.A. Cooperative Cataloging
Committee in the cataloging of series and
books in foreign languages. As anticipated in my 
report for last year, these entries have proved dif- 
ficult to handle because nearly all are in foreign 
languages, and they bring up many unsettled 
points in cataloging, difficult to handle by corre- 
spondence." 

Again in 1941, following the establishment of the
Cooperative Cataloging Section in the Descriptive
Cataloging Division at the Library, it was noted
that ". . . an attempt was made to bring the cooperative
cataloging more in harmony with the Library
of Congress work and to make more use of
the cards produced in the cataloging of the Library's
own books. Previously, some of the catalogers
at the Library of Congress held such a low
opinion of the cooperative cards that they often
ignored them when the book was received in the
Library, and did the work again." ^

By 1967, cooperative titles edited for other libraries
had dropped to 2,295 titles.^ In 1968, with
the "shared cataloging" project initiated under
provisions of Title IIC of the Higher Education
Act of 1965 well under way, contribution of cooperative
copy ceased. With shared cataloging as
a centralized activity at the Library, the opportunity
for maximum standardization of cataloging
records does exist because the cataloging product
emanates from a single source. "One of the most
significant future implications of the present
[shared cataloging] program is the possibility of
achieving greater bibliographic compatibility,"
James Skipper remarked in reviewing the project.^

The development of a data store of retrospective 
cataloging records from a number of contributing
sources comes up squarely against problems of 
standards, uniformity, and compatibility, whether 
the sources be traditional card or book form cata- 
log entries or machine stored data. The reasons for 
the dilemma are not difficult to perceive. 

First, cataloging at any one institution is performed
in relation to the body of cataloging data
which it has developed through the years of its
existence and incorporated into its own cataloging
record. Second, the cataloging product is governed
by codes specifying guiding principles and rules of
practice, authority lists of subject terms, classification
schedules, standardized lists of names (personal,
corporate, and geographic), and similar
criteria of authority. The final record is also influenced
by human judgment and competence. All
of the cataloging criteria have been in an evolutionary
process over the years and are subject to
future changes. How consistently libraries have
applied codes and other criteria and how extensively
they have modified prior data to reflect
changes are open questions. The published NUC
suggests that much inconsistency and few changes
(other than revision and editing of main and some
added entries) have been introduced in outside
libraries.

A small sampling of NUC entries provided by
contributing libraries quickly brings into focus the
critical problems of compatibility among name entries,
subject headings, and classification. All of
these elements would be vital for successful machine
processing of a full bibliographic record for
the following purposes: 1) to search and produce
catalog card sets, 2) to search by topical subject
terms and personal or corporate names used as
subjects, 3) to identify records by classification
number, and 4) to search by author and title. Other
data elements encoded in preparing the record
might also be used for search and print-out, visual
display, or other retrieval capability as well as for
uses only vaguely perceived at this time.

The Sterling Memorial Library at Yale includes 
a major collection of literature in German and Ro- 
mance languages. In addition, Yale has cataloged 
thousands of dissertations of continental scholars, 
publications which frequently provide vitae for 
author identification. In establishing the author 
names for its catalogs, Yale did so in relation not 
only to established forms of names on LC cards, 
but, perforce, in relation to its own catalog which 
included many more similar names. Inevitably the 
same surname has often been identified in either a 
briefer or fuller form, with or without dates, when 
one compares LC and Yale forms for the same individual.
Neither can be said to be incorrect, yet
they differ because they were established at differ- 
ent times to be compatible with different catalogs. 

A card representing Laws Relating to the Prac- 
tice of Dentistry and Dental Hygiene published by 
the Texas State Board of Dental Examiners and 
cataloged by the New York Public Library pre- 
sents a number of variations from the cataloging 
data which the Library of Congress or many other 
libraries would provide. The NYPL main entry is
"Texas. Statutes" while the LC form is "Texas.
Laws, statutes, etc." Passing over variations from
Anglo-American Cataloging Rules in capitalization
and paragraphing in the body of the entry,
one finds that the NYPL subject heading is
"Dentistry--Jurisp.--U.S.--Texas" while the LC form is
not only quite different but provides for direct
rather than indirect subdivision. The added entry
from NYPL is "Texas. Dental examiners, Board of"
because document headings in its catalog have been
established in an inverted form. The title is not
classified but bears a unique, alpha-numeric number
showing a fixed-order location. The difficulties
in attempting to convert such entries to form and 
substance compatible with those of LC are obvious. 
The NYPL subject headings still retain in some substantial
measure the early structure of an alphabetico-classed
system. The Library of Congress
uses "Malay Languages" as a subject heading with
see-also references to some 55 related languages
and dialects including "Tagalog." The NYPL form
for "Tagalog" is "Malay language--Dialects;
Tagala." These examples are not isolated exceptions
in the entire body of cataloging contributed
to the NUC in card form, but are representative of
variations in a significant portion of the file.

A variety of classification schemes are represented
in the NUC: LC, local adaptations of LC,
Dewey, Dewey with changes and with numbers
derived from many successive editions, and a host
of other schemes. This latter category includes
many locally developed or locally derived systems
such as those of Yale, Harvard, NYPL, and many
special libraries, as well as numbers for fixed-order
systems and items with no classification at
all. Many special libraries with significant holdings,
such as Union Theological Seminary, the
National Library of Medicine, and the National
Agricultural Library, have their individual classification
and subject heading systems. Together
with major public and university libraries, including
some of the largest contributors to the NUC
card record, they have provided catalog records
over the years which are often seriously incompatible
with the data of other libraries and with LC
cataloging. Again, it should be noted that the data
are not incorrect but different.

Approximately 2.5 million catalog records (a 
gross sum, not adjusted for duplicates) have been 
converted to machine-readable form by 22 libraries.^
Offhand, they seem to offer an inviting source
of records for a national data base. But these libraries
have encoded records that represent their
individual cataloging experience and history. Although
there may be a relatively high consistency
within the data base of any one library or network,
the records taken as a whole are unlikely to
provide more than accidental consistency in terms
of the entry forms, subject terminology, etc. They
also represent differing levels of data, running
from brief identification for purposes of automated
circulation control to full bibliographic records
compatible with MARC. Therefore, it is apparent
that a major editing and recataloging effort would
be required to assimilate them into a uniform data 
base. 

The conclusion seems inescapable that the most 
useful machine-readable bibliographic data base
must be one derived from a single major source, as
is the current base being developed in the MARC
program. It should be a source that offers a relatively
high degree of consistency in the application
of cataloging standards, one which reflects a full
rather than a partial record, and one that has
historically incorporated changes and is still
hospitable to future change and updating. This
confirms that the conversion of retrospective catalog
records should, insofar as possible, be based
on the LC Official Catalog record. Nevertheless,
we need also to pursue solutions to the problem
of how to expand and enhance the retrospective
data base beyond the initial scope of the LC Official
Catalog in order to incorporate the millions of
titles not held by the Library of Congress. Cooperative
funding, rather than cooperative preparation,
may well be the route to follow.

Appendix B



The National Union Catalog — Its Characteristics and Activity 



This paper describes briefly the major characteristics
of the National Union Catalog (NUC)
maintained by the Library of Congress and gives 
basic data about the level of reporting by Ameri- 
can libraries. This information may be helpful in 
analyzing some of the problems that must be faced 
in planning for a national bibliographic store in 
machine-readable form. 

NUC as an entity is a file of catalog records for 
works held by American libraries. In general, each 
distinct record is represented by a single entry 
under author or title but added entry references 
are included in the newer part. Since NUC reports
are subjected to only minimal editing, the same 
bibliographical item may be represented by more 
than one entry filed under different headings in 
widely dispersed portions of the file. 

NUC is divided into two main components: the 
older part covers imprints through December 31, 
1955 ; the new part covers 1956 and later imprints. 
When the NUC Publication Project began in 1967,
it was estimated that the pre-1956 part contained
16-18 million cards. The proportion of duplicate
entries was known to be high, however, and it has
been confirmed in the process of editing and publishing
volumes for the entries under A and B. It
is probable that the true size of this part of NUC is
closer to 10 million cards. The post-1956 file contains
about 3.75 million cards (including references
and added entries) for items not yet
represented in a quinquennial book catalog.

Responsibility for reporting to NUC is assigned
on a regional basis. An effort is made to have at
least two libraries in each region report comprehensively;
the others, selectively. Criteria for full
reporting of 1956+ imprints and selective reporting
are set forth in Addendum 1. The unit for
reporting is "card" represented by LC printed
cards, card order slips, or skeleton entries for items
represented by LC cards. Titles not represented by
LC printed cards are supposed to be reported in
full cataloging form.

It is difficult to estimate the number of libraries 
represented in the NUC. The libraries listed in
Symbols of American Libraries ^ are not a true
indication of contributors because that publication
provides symbols for many institutions that have
not yet sent cards to NUC. A current estimate by the
Chief of the Union Catalog Division places the 
number in the vicinity of 1,000. This figure takes 
as its base a 1962 statement that 763 libraries had 
reported their holdings up to that time.^ The number
of active contributors is much smaller, amounting
to 328 libraries in fiscal 1969.^ It should be
taken into consideration, however, that receipts
from such sources as the Union Library Catalogue
of the Philadelphia Metropolitan Area and the
Cleveland Regional Union Catalog comprise titles
from a number of libraries.

The level of reporting naturally varies considerably
from library to library. Apart from the Library
of Congress, the largest contributor reported
nearly 136,000 titles and several contributors reported
only one title. Table B.1 shows the distribution
of active libraries by number of reports and
date of coverage for fiscal 1969.

Addendum 1

Criteria for Full Reporting of 1956 Imprints 
to The National Union Catalog Approved by 
the A.L.A. Board on Resources Committee on 
the National Union Catalog, Chicago, Jan. 30, 
1957 

To assure that the printed National Union 
Catalog will be developed to its full potentialities 
(i.e. to contain entries for all titles of 1956 imprints
acquired by American libraries and to
record approximately twenty locations of such
works geographically dispersed throughout the
U.S.A.) the A.L.A. Board on Resources has recommended
that a relatively small number of important
libraries in strategic geographical locations
undertake "full" reporting and that
hundreds of smaller, or special libraries provide
"selective" reporting to The National Union
Catalog. The following criteria for "full" reporting
were approved by the Board on January 30,
1957.

The word "full" is not to be interpreted as
"complete" or "entire", since there are certain categories
for which cards would be superfluous. Thus,
when a library is asked to report "fully" it should
report all 1956 imprints, including those that are
represented by LC printed cards, with the following
exceptions:

Reprints
Serials
United Nations Publications
Titles for which "cdp" copy is requested by Card Division
Official state publications (except the one library in each
state designated to report)
U.S. Government Publications (except analytics in series
not analyzed on LC cards)

Of course, those libraries that duplicate all of 
their cards and find it expedient to send copies of
all such cards may continue to do so; unnecessary
cards will be discarded by the Union Catalog Division.
However, if selection by the cooperating
library will prove advantageous, cards for the
above indicated categories of materials may be 
withheld with a resulting saving of labor at the 
National Union Catalog. 

Be sure that the proper symbol for your library
is affixed to each entry. Libraries that duplicate
their own cards should add an asterisk to their
library symbol when such cards are produced from
unaltered LC card texts. This will expedite the
handling of entries by the Editorial Staff. Cards
should be sent to the Union Catalog Division,
Library of Congress, Washington D.C. 20540. Yellow
mailing labels are available on request.

The suggested categories of exclusion will be 
applicable to the general run of cataloged materials.
However, it is expected that on occasion
catalogers will recognize exceptional titles within 
these categories which should be reported because 
of their rarity, unusual research value, etc. 

Items represented by LC printed cards may be 
reported in any of the following simplified forms : 

Send yellow card order slips that are returned by the 
Canl Division with filled card orders. These slips .should 
be stamped **Por NUC from 

Send a skeleton entry which may be limited to fuU au- 
thor entry, first few words of title, imprint date, IX" 
card number, and your library symbol. 

Send a copy of the LC card on which the symbol for 
your library is affixed. 

Revised Criteria for Selective Reporting of 1956 
Imprints to the National Union Catalog Ap- 
proved by the A.L.A. Board on Resources Com- 
mittee on the National Union Catalog, Chicago, 
January 30, 1957 

These criteria are devised to make certain that at least one copy of
every title of potential research value published in 1956 and later is
recorded in The National Union Catalog and at the same time prevent a
flood of reports of widely held commonplace books beyond the capacity of
the editorial staff to handle. The immediate objective is to provide a
published national catalog of monographs of 1956 imprints through The
National Union Catalog and of serial publications, commencing with 1950,
through New Serial Titles. Both publications attempt to locate titles in
libraries at various geographical points throughout the U.S. and Canada
so that the interlibrary loan burden will be spread more equitably and
that borrowing libraries will have a reasonable chance of finding
desired items in a neighboring institution.

The following are general criteria intended for the guidance of
libraries asked to report only a selection of their 1956 imprints.
Titles falling within the criteria are to be reported even when LC
printed cards are available.

What To Report

Monographs (including monographs in series)

1. All books published outside the U.S. including titles in all
alphabets and publications of foreign governments.

2. Items not in the book trade published in your region and/or within
your sphere of acquisition.

3. Publications of the state government of the state in which your
library is located unless another library in your state is reporting
such materials (but not of other states).

4. In addition to the above broad categories, cards should be sent to
the NUC for:

    a) All titles for which no LC cards are available.

    b) Imprints of rare or unusual character, or which are considered
    collectors' items.

    c) Analytics of monographs in series (including U.S. Government
    publications) when not analyzed by LC cards.

5. Revised entries for works previously reported should be clearly
designated as such and should indicate previous form of entry when main
heading has been changed.
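
The monograph criteria above amount to a short decision rule. The sketch below is only an illustration of that rule: the `Title` record, its field names, and the `should_report` function are hypothetical and do not come from the report, and real selection would also rest on a cataloger's judgment (for example, of what counts as rare or unusual).

```python
from dataclasses import dataclass


@dataclass
class Title:
    """Hypothetical description of one cataloged monograph (assumed fields)."""
    published_outside_us: bool = False
    in_book_trade: bool = True
    in_region_or_sphere: bool = False      # your region / sphere of acquisition
    own_state_publication: bool = False    # issued by your own state government
    state_covered_elsewhere: bool = False  # another library reports your state
    lc_card_available: bool = True
    rare_or_unusual: bool = False
    unanalyzed_analytic: bool = False      # analytic in series not analyzed by LC


def should_report(t: Title) -> bool:
    """Return True if the title meets any of criteria 1 through 4."""
    if t.published_outside_us:                                      # criterion 1
        return True
    if not t.in_book_trade and t.in_region_or_sphere:               # criterion 2
        return True
    if t.own_state_publication and not t.state_covered_elsewhere:   # criterion 3
        return True
    # criterion 4: a) no LC card, b) rare or unusual, c) unanalyzed analytic
    return (not t.lc_card_available) or t.rare_or_unusual or t.unanalyzed_analytic
```

Under these assumptions, a commonplace U.S. trade book with an LC card is withheld, while any foreign imprint is reported even when an LC card exists.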

Serials

New Serial Titles, the serials counterpart of The National Union
Catalog, lists titles and holdings of serials whose first number was
issued January 1, 1950 and later. Such entries will not be published in
the NUC.

Borderline publications which might be cataloged as either monographs
or serials may be reported to The National Union Catalog which will
either publish the entry or forward it to New Serial Titles.

Libraries not now reporting to NST are urged to secure report forms and
instructions from The Editor, New Serial Titles, Library of Congress,
Washington 25, D.C.

Note: Catalog cards should be sent to the Union Catalog Division,
Library of Congress, Washington 25, D.C. Yellow mailing labels are
available on request.

How To Report

All reports to the NUC should be identified with the proper library
symbol.

Titles not represented by LC printed cards should be reported in full
cataloging form, including added and subject entries.

Items that are represented by LC printed cards may be reported in any
of the following simplified forms:

Send yellow card order slips that have been returned to you by the Card
Division with filled card orders. Such slips should be stamped "For NUC
from". Or,

Send skeleton entries giving full main heading, first few words of
title, imprint date, LC card number, and your library symbol. Or,

Send copies of LC printed cards on which your symbol is affixed.

Libraries that do not use LC cards are urged to add an asterisk to their
library identification symbol when such cards are produced from
unaltered LC card texts. This practice will expedite the handling of
such entries by the Editorial Staff.

The National Union Catalog 
General Information 

The National Union Catalog (NUC) is a record of publications and their
location in the Library of Congress and more than 1,100 other libraries
in the United States and Canada. As such it is the central register of
library resources in North America. Major portions of the NUC are
published on a continuing basis as detailed in paragraphs 3 and 4, but
the bulk of the record for imprints prior to 1950 is contained in card
files. This NUC on cards is housed principally in the Library's Main
Building, Room MB-140-A. Until its abolition in July 1970 the Union
Catalog Division exercised most NUC functions, including liaison with
the public, but the various activities relating to the NUC are currently
distributed among several Library divisions. The following statement
summarizes present arrangements. Further information concerning any of
the following services or publications is available upon request to the
appropriate address.

1. Reference Service

Reference service on book locations and bibliographic information
recorded in the NUC (published and unpublished) and in various auxiliary
union catalogs in oriental and Slavic languages is the responsibility of
the Union Catalog and International Organizations Reference Section,
General Reference and Bibliography Division. The office of Robert W.
Schaaf, Head of the Section, and John W. Kimball, Assistant Head, is
located in MB-144 Balcony (phone 202-i26~o534). The Union Catalog
Reference Unit, Mrs. Dorothy Kearney, Supervisor, is in the adjacent
Room MB-140-A which houses most of the NUC card files for imprints prior
to 1952. As part of its service the Unit prepares and circulates to
about 75 research libraries the Weekly List of Unlocated Research Books.
The telephone number for reference inquiries is 202-426-6300. Written
requests should be addressed: Library of Congress, Union Catalog
Reference Unit, Washington, D.C. 20540.

2. Submission of Reports to NUC

Matters concerning reports to the NUC (i.e., the transmission of catalog
cards of any imprint date by libraries to the NUC), and replies to
inquiries concerning reporting criteria are the responsibility of the
Catalog Publication Division, Mrs. Gloria Hsia, Chief. This division is
located in the Massachusetts Avenue Annex, 214 Massachusetts Avenue NE.
The address is Library of Congress, Catalog Publication Division,
Washington, D.C. 20540.

3. Catalog of Post-1955 Imprints and Specialized Publications

The National Union Catalog, a Cumulative Author List is published in
monthly issues with quarterly, annual, and quinquennial cumulations. It
includes titles currently cataloged by the Library of Congress on
printed cards and monographic titles for 1956 and later years that are
reported by major U.S. research libraries and some Canadian libraries.
This Catalog is supplemented by the Register of Additional Locations.
Other specialized publications are:

Symbols of American Libraries (Earlier editions entitled: Symbols Used
in the National Union Catalog of the Library of Congress)

Requests for symbols for additional libraries to be included and notices
of changes of name, etc., of libraries already included, should be
addressed to Library of Congress, Catalog Publication Division, Editor,
Symbols of American Libraries, Washington, D.C. 20540.

National Register of Microform Masters

Reports of locations of microform masters (i.e., microforms used only to
make other copies) should be addressed to Library of Congress, Catalog
Publication Division, Editor, National Register of Microform Masters,
Washington, D.C. 20540.

Newspapers on Microfilm

Reports of microfilms of American and foreign newspapers should be
addressed to Library of Congress, Catalog Publication Division, Editor,
Newspapers on Microfilm, Washington, D.C. 20540.

Microfilming Clearing House Bulletin

This is issued at irregular intervals, as reports are received, and
appears as a supplement to the Library of Congress Information Bulletin.
Reports of major microfilming projects, planned or completed, should be
made to Library of Congress, Catalog Publication Division, Microfilming
Clearing House, Washington, D.C. 20540. In addition to general reports
of projects to MCH, reports of the individual titles filmed as part of
the project should also be made to the editor of the pertinent Library
of Congress catalog.

The National Register of Microform Masters is edited by Harold Cumbo
(202-426-5980). Newspapers on Microfilm, the Microfilming Clearing House
Bulletin, and Symbols of American Libraries are edited by Imre Jarmy
(202-426-5959).

4. National Union Catalog, Pre-1956 Imprints

The National Union Catalog Publication Project, Johannes Dewton, Head,
is responsible for editing the National Union Catalog, Pre-1956
Imprints, which is being published by Mansell Information/Publishing
Ltd. Over 100 of a projected 610 volumes have been issued, and the
entire project is expected to take about 10 years. Staff of the project,
which is not charged with responsibility for service to the public, is
located in MB-137.

Appendix C



Major Duties Involved in the Preparation of the 
Library of Congress Book Catalogs 



The following list summarizes the major duties 
involved in the manual preparation of the Library 
of Congress book catalogs : 

A. Arranging and sorting

For National Union Catalog and Register of Additional Locations

1. Receiving, recording, and sorting of outside library reports.

2. Receiving, recording, and sorting of LC printed cards.

3. Recording and sorting of typed printing file cards.

4. Sorting of various other cards such as cancellation notices, entry
revision notices, etc.

5. Arranging all of the above, some numerically, for processing or
filing.

For Books: Subjects

1. Sorting of LC printed cards, typed subject heading and reference
cards, also cards for various in-process or auxiliary files.

2. Arranging these for processing or filing.

For other catalogs

Each catalog has its own array of print files, auxiliary files and
in-process files for which cards may be sorted, recorded, or arranged.

B. Filing

For NUC

1. Filing into Control File.

2. Filing into several print files.

3. Filing into various in-process or auxiliary files.

For Books: Subjects

1. Filing into subject authority file.

2. Filing into several print files.

3. Filing into various in-process or auxiliary files.

For other catalogs

1. Filing into several print files.

2. Filing into various in-process and auxiliary files.

C. Searching

For NUC and Register

1. Searching in Control File and 1958-1962 printed book catalog.

2. If found, add symbol to Control File card and forward report; current
to NUC Author List, non-current to Register.

3. If not found, but heading is found, refer to be edited.

4. If not found and heading lacking, either a) if modern personal author
heading, refer to be edited; or b) if corporate author or older personal
name heading, refer to Official Catalog for additional searching.

5. Searching in Official Catalog, when required, for established form of
heading for corporate authors and older personal names.

6. Searching and matching in NUC print files to add current locations.

7. For the Register, searching in the NUC 1963 annual to add card number
to 1963 outside library reports.

8. Searching conflicts in Official Catalog, the Control File, the book
catalogs, or various Card Division files.

For NRMM

Searching for LC card numbers and established headings in Official
Catalog, NUC: Pre-1956 Imprints, the Main Catalog, etc.

For other catalogs

Various searching to determine status of particular entry or heading, to
solve conflicts, to establish headings.
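
Steps 2 through 4 of the NUC and Register search describe a small routing decision for each outside library report. The following is a minimal sketch of that decision only; the `Route` names, the `route_report` function, and its boolean parameters are assumptions made for illustration, since the manual procedure in the report works on cards, not on data records.

```python
from enum import Enum


class Route(Enum):
    """Hypothetical labels for where a searched report goes next."""
    NUC_AUTHOR_LIST = "add symbol, forward to NUC Author List"
    REGISTER = "add symbol, forward to Register"
    EDITING = "refer to be edited"
    OFFICIAL_CATALOG = "refer to Official Catalog for additional searching"


def route_report(entry_found: bool, heading_found: bool,
                 is_current: bool, modern_personal_author: bool) -> Route:
    """Route one report after the Control File / 1958-1962 catalog search."""
    if entry_found:
        # Step 2: current reports go to the Author List, others to the Register.
        return Route.NUC_AUTHOR_LIST if is_current else Route.REGISTER
    if heading_found:
        # Step 3: entry is new but the heading is already established.
        return Route.EDITING
    # Step 4: neither entry nor heading was found.
    if modern_personal_author:
        return Route.EDITING           # 4 a)
    return Route.OFFICIAL_CATALOG      # 4 b) corporate or older personal name
```

For example, `route_report(False, False, True, False)` sends a report with an unestablished corporate heading to the Official Catalog, which corresponds to step 5.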

D. Editing

For NUC (Outside Library Reports)

1. Verifying choice and form of heading. Establish heading if new.

2. Verifying general correctness of cataloging.

3. Providing for requisite added entries and cross references.

For NUC (Printing files and/or Control File)

1. Providing for requisite added entries and cross references for LC
cards.

2. Providing for requisite information cards: history cards, name prefix
cards, acronym cards, etc.

3. Reviewing of print files and final page copy.

4. Solving conflicts, correcting errors, updating entries, making
corrections and changes. Coordinating related affected entries in same
file. Coordinating changes between the printing files and the Control
File and between Catalog Publication Division and the descriptive
cataloging divisions.

For Register

1. Preparing brief author-title entries for added locations to outside
library reports in the NUC 1958-1962 issue.

2. Preparing controls and references for cancelled or superseded card
numbers.

3. Reviewing of print files and final page copy.

4. Solving conflicts, correcting errors, updating entries, making
corrections and changes. Coordinating changes between the Register
files, the NUC print files, and the Control File.

For other catalogs

Each catalog has its own editing requirements based on catalog content,
entry format, etc.

E. Typing and proofreading

For NUC

1. Typing of printing file cards for outside library reports.

2. Typing of printing file added entries and cross references for LC
printed main entries.

3. Proofreading of typed cards. (Typed added entries and cross
references for outside reports are also Xeroxed for use in Control
File.)

For other catalogs

As required.

F. Composing of page copy (Mounting and stripping)

For all catalogs (except Symbols of American Libraries)

1. Preparing camera copy by shingling and taping cards onto 14^/^ by
20 inch cardboards.

2. Numbering and collating pages.

3. Dismantling of camera copy used in past issues so that cards can be
re-used in the next larger cumulation.

Analysis of Library of Congress Card Orders
(April 1970-March 1971)



The Card Division provided a magnetic tape listing all LC titles ordered
in a one-year period and the frequency of order. From April 1, 1970, to
March 31, 1971, 11,896,521 orders were received for 1,209,198 titles.
Tables were made to summarize the entire tape and two subsets of the
tape. The first group was a random sample of 1,710 titles selected to
study the relationship of language to card orders. The second group
comprised the 1,000 most frequently ordered cards. It should be
remembered that this analysis does not include the over 1,000
subscribers to complete sets of LC proofsheets or the 84 research
libraries who receive depository sets of all currently printed LC cards.
If these libraries had ordered cards instead, the number of cards
ordered from the 7 series would probably be substantially larger.

The following comments point out any unusual characteristics of each
table. Because some of the counts were made by computer and some by hand
or estimation, the tables have certain small discrepancies. Figures have
been rounded to indicate the approximations. Because the 7 series began
in December 1968, the 7 series includes both 1969 and 1970 printed
cards. It was decided to ignore the number of cards printed between
January and March 1971 as being too recent to have been ordered by
outside libraries.

Tables D.1-3 and Figure D.1 present some characteristics of the total
tape. Table D.1 shows that 42 percent of the 7-series cards printed were
ordered. Actually the demand for current LC cards was appreciably higher
as reflected by the distribution of proofsheets and depository sets. The
number of cards ordered once (as shown in Tables D.2 and D.3) differs by
0.25 percent. The almost constant level of cards ordered one time is
shown in Table D.3. Figure D.1 shows the close relationship of cards
printed (top line) to cards ordered (bottom line).
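
The summary figures quoted in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text and in Tables D.1-3 (total orders and titles, the 7-series counts, and the two "ordered once" totals); the variable names are illustrative, and taking the Table D.2 total as the base for the 0.25 percent difference is an assumption, since the report does not say which base was used.

```python
# Totals stated in the text for April 1970 through March 1971.
orders_received = 11_896_521
titles_ordered = 1_209_198
avg_orders_per_title = orders_received / titles_ordered

# 7-series titles ordered and available, from Table D.1 (rounded data).
seven_series_pct = 100 * 174_100 / 416_900

# Titles ordered once, as reported in Table D.2 and in Table D.3.
once_d2, once_d3 = 486_400, 487_600
once_gap_pct = 100 * (once_d3 - once_d2) / once_d2  # base is an assumption

print(f"average orders per title: {avg_orders_per_title:.1f}")
print(f"7-series titles ordered:  {seven_series_pct:.0f} percent")
print(f"ordered-once difference:  {once_gap_pct:.2f} percent")
```

With these inputs the three results round to 9.8 orders per title, 42 percent, and 0.25 percent, which agrees with the figures given for the total tape.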



Figure D.1. Number of LC cards printed by card series in comparison to
number of LC cards ordered (April 1970-March 1971) by card series

Tables D.4-6 are based on a random sample of 1,710 cards which was drawn
from the entire listing of LC card orders. The fact that the total
percentages in Tables D.5 and D.6 are remarkably similar to those in
Tables D.1 and D.2 confirms that the sample is representative of the
total. A rough count showed that less than two percent of the cards were
not monographs; almost all the serials were in English. Because this
analysis covers only card orders and does not include the use of
proofsheets, depository sets or book catalogs, the high percentage of
English titles ordered (77

Table D.1. Number and percentage of all titles ordered from April 1970
through March 1971, by period of series

                 Number of     Percentage    Total number
Period of        titles        of total      of titles       Percentage   Average number
series           ordered (1)   titles        available (1)   ordered      of orders (2)
                               ordered

Pre-1900             3,400         0.3            20,500        16.6            2.5
1900-09             69,700         5.8           394,400        17.7            2.2
1910-19             69,000         5.7           406,200        17.0            2.2
1920-29             68,700         5.7           329,900        20.8            2.7
1930-39             99,500         8.2           468,200        21.3            3.0
1940-49            116,100         9.6           592,700        19.6            3.6
1950-59            203,500        16.8           896,200        22.7            5.7
1960-68            405,200        33.5         1,029,100        39.4           11.5
7 series (3)       174,100        14.4           416,900        41.8           27.9

All series       1,209,200       100.0         4,554,100        26.6            9.8

(1) Data rounded to nearest hundred.
(2) Calculated from unrounded data.
(3) Includes 1969 and 1970 cards.



Table D.2. Number and percentage of titles ordered from April 1970
through March 1971, by frequency of orders

Frequency of      Number of     Percentage    Cumulative    Cumulative
orders            titles (1)                  number of     percentage
                                              titles

2,000 or more         20 (2)        (3)              20         (3)
1,000 to 1,999       110            (3)             130        0.01
900 to 999            40            (3)             170        0.01
800 to 899            60            (3)             230        0.02
700 to 799           100           0.01             330        0.03
600 to 699           180           0.01             510        0.04
500 to 599           310           0.03             820        0.07
400 to 499           620           0.05           1,440        0.12
300 to 399         1,400           0.12           2,840        0.23
200 to 299         3,500           0.29           6,340        0.52
100 to 199        12,800           1.06          19,140        1.58
90 to 99           3,000           0.25          22,140        1.83
80 to 89           4,200           0.35          26,340        2.18
70 to 79           5,000           0.41          31,340        2.59
60 to 69           7,300           0.60          38,640        3.20
50 to 59          10,100           0.84          48,740        4.03
40 to 49          13,800           1.14          62,540        5.17
30 to 39          22,300           1.84          84,840        7.02
20 to 29          38,700           3.20         123,540       10.22
10 to 19          92,900           7.68         216,440       17.90
9                 18,500           1.53         234,940       19.43
8                 22,500           1.86         257,440       21.29
7                 27,100           2.24         284,540       23.53
6                 35,600           2.94         320,140       26.47
5                 46,000           3.80         366,140       30.28
4                 62,500           5.17         428,640       35.45
3                 99,100           8.20         527,740       43.64
2                195,100          16.13         722,840       59.78
1                486,400          40.22       1,209,240      100.00

(1) Figures above 1,000 rounded to hundreds; those below 1,000 rounded
to tens.
(2) The largest number of orders for a title was 3,280.
(3) Less than 0.01 percent.

Table D.3. Number and percentage of titles ordered once from April 1970
through March 1971, by period of series

                 Number of     Percentage    Total number
Period of        titles        of total      of titles       Percentage
series           ordered (1)   titles        available (1)   ordered
                               ordered

Pre-1900             2,000         0.4            20,500         9.8
1900-09             41,100         8.4           394,400        10.4
1910-19             41,000         8.4           406,200        10.1
1920-29             36,800         7.6           329,900        11.2
1930-39             52,000        10.7           468,200        11.1
1940-49             58,700        12.0           592,700         9.9
1950-59             82,400        16.9           896,200         9.2
1960-68            124,400        25.5         1,029,100        12.1
7 series (2)        49,200        10.1           416,900        11.8

All series         487,600       100.0         4,554,100        10.7

(1) Data rounded to nearest hundred.
(2) Includes 1969 and 1970 cards.
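
The cumulative columns of Table D.2 follow from a running total over the frequency classes, from the most to the least frequently ordered. The sketch below reproduces that calculation from the rounded title counts in the table; the list layout and variable names are illustrative choices, not part of the report.

```python
from itertools import accumulate

# (frequency of orders, number of titles), in the order given in Table D.2.
rows = [
    ("2,000 or more", 20), ("1,000 to 1,999", 110), ("900 to 999", 40),
    ("800 to 899", 60), ("700 to 799", 100), ("600 to 699", 180),
    ("500 to 599", 310), ("400 to 499", 620), ("300 to 399", 1_400),
    ("200 to 299", 3_500), ("100 to 199", 12_800), ("90 to 99", 3_000),
    ("80 to 89", 4_200), ("70 to 79", 5_000), ("60 to 69", 7_300),
    ("50 to 59", 10_100), ("40 to 49", 13_800), ("30 to 39", 22_300),
    ("20 to 29", 38_700), ("10 to 19", 92_900), ("9", 18_500),
    ("8", 22_500), ("7", 27_100), ("6", 35_600), ("5", 46_000),
    ("4", 62_500), ("3", 99_100), ("2", 195_100), ("1", 486_400),
]

total_titles = sum(count for _, count in rows)
cumulative = list(accumulate(count for _, count in rows))

# Cumulative percentage of titles ordered at least this often.
cumulative_pct = {
    label: round(100 * running / total_titles, 2)
    for (label, _), running in zip(rows, cumulative)
}

for (label, count), running in zip(rows, cumulative):
    print(f"{label:>15} {count:>9,} {running:>11,} {cumulative_pct[label]:>7.2f}")
```

Read this way, the table shows how concentrated demand was: titles ordered ten or more times make up under 18 percent of all titles ordered, while about 40 percent were ordered only once.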

Table D.4. Number and percentage of cards in a sample of LC card orders,
by language

                                                             Percentage of
Language(s)                            Number   Percentage   current LC
                                                             cataloging

English                                 1,320       77
French/German                             170       10
Italian/Spanish/Portuguese/Romanian        92        5
Dutch/Scandinavian                         13        1
                                           55        3
Other roman                                28        2
Other nonroman                             32        2

Total                                   1,710      100           100



Table D.5. Number and percentage of English and non-English cards in a
sample of LC card orders, by period of series

                  English titles       Non-English titles         Total
Period of
series            Number  Percentage   Number  Percentage   Number  Percentage

Pre-1900               3     100.0                                3       0.2
1900-09               83      85.6        14      14.4           97       5.7
1910-19               87      85.3        15      14.7          102       6.0
1920-29               87      85.3        15      14.7          102       6.0
1930-39              113      77.4        33      22.6          146       8.5
1940-49              122      72.6        46      27.4          168       9.8
1950-59              229      76.9        69      23.1          298      17.4
1960-68              427      77.2       126      22.8          553      32.3
7 series (1)         169      70.1        72      29.9          241      14.1

Total              1,320      77.2       390      22.8        1,710     100.0

(1) Includes 1969 and 1970 cards.



Table D.6. Number and percentage of English and non-English cards in a
sample of LC card orders, by frequency of orders

                  English titles       Non-English titles         Total
Frequency of
orders            Number  Percentage   Number  Percentage   Number  Percentage

400 to 499             1     100.0                                1       0.1
300 to 399             2     100.0                                2       0.1
200 to 299             5     100.0                                5       0.3
100 to 199            19     100.0                               19       1.1
90 to 99               4     100.0                                4       0.2
80 to 89               6     100.0                                6       0.4
70 to 79               7     100.0                                7       0.4
60 to 69              11     100.0                               11       0.6
50 to 59              15     100.0                               15       0.9
40 to 49              21     100.0                               21       1.2
30 to 39              29      93.6         2       6.4           31       1.8
20 to 29              52     100.0                               52       3.0
10 to 19             121      96.8         4       3.2          125       7.3
9                     23      85.2         4      14.8           27       1.6
8                     28      90.3         3       9.7           31       1.8
7                     35      89.7         4      10.3           39       2.3
6                     43      86.0         7      14.0           50       2.9
5                     62      89.9         7      10.1           69       4.0
4                     71      81.6        16      18.4           87       5.1
3                    115      81.0        27      19.0          142       8.3
2                    191      69.7        83      30.3          274      16.0
1                    459      66.3       233      33.7          692      40.5

Total              1,320      77.2       390      22.8        1,710      99.9

percent) does not mean that foreign language titles are not needed by
American libraries.

Tables D.7 and D.8 are related to the 1,000 most frequently ordered
cards. Eight of the printed cards were not available from the Card
Division and two had been superseded by later revisions of the same
titles. Therefore the 1,000 most frequently ordered cards were reduced
to 990 titles as shown in the tables. All 990 cards were English and 90
percent (887) had the MARC notation on them. The range of orders was
from 3,280 to 470. Ninety-four percent (933) were monographs; 6 percent
(55) were serials and two titles were atlases.

Table D.7. Number and percentage of 1,000 most frequently ordered cards,
by period of series

Period of series    Number    Percentage

1904-09                  5        0.5
1910-19                  2        0.2
1920-29                  3        0.3
1930-39                  4        0.4
1940-49                  5        0.5
1950-59                  7        0.7
1960-68                 91        9.2
7 series (1)           873       88.2

Total                  990      100.0

(1) Includes 1969 and 1970 cards.



Table D.8. Number and percentage of 1,000 most frequently ordered cards,
by LC classification

                                            Percentage of
LC classification   Number    Percentage    current LC
                                            cataloging

A                        5        0.5            0.8
B                       42        4.2            7.5
C                       14        1.4            1.1
D                       61        6.2           10.7
E                      136       13.7            1.6
F                       14        1.4            2.1
G                       22        2.2            2.9
H                      187       18.9           13.3
J                       17        1.7            2.5
KF                      15        1.5            3.1
L                       62        6.3            3.3
M                       19        1.9            2.8
N                       21        2.1            4.5
P                      179       18.1           22.5
Q                       68        6.9            6.1
R                       29        2.9            2.1
S                       12        1.2            1.6
T                       38        3.8            8.5
U                       14        1.4            0.4
V                        2        0.2            0.3
Z                       32        3.2            2.2
Law                      1        0.1            (1)

Total                  990       99.8           99.9

(1) Figure not available.

Conversion strategy, 3, 30-32
ence, o, 32

Elimination of ineligible records from other data bases, 13-15

Format recognition, 11, 14-15
Funding, 3, 32

Inconsistencies in NUC reports, 34-35
Indexes as an alternative to full conversion, 31
Indexes to NUC: cumulation, 22; data elements, 20-21; master index
    record, 21-22; sorting, 24-25; types, 19-20; updating, 23

Keying; see Typing

LC card orders, analysis of, 44-47
LC catalog cards, changes by other libraries, 34
Levels of machine-readable records, 2, 4-6, 31; definition, 4
Location reports for NUC, 19, 23-24, 38-41

Manpower costs, 15
MARC Distribution Service, 22, 23
MARC records in other data bases, 7
Master index record, 21-22
Microstorage with machine indexes, 31
Model NUC network, 28

Nonsystematic conversion, 30-31
NUC function in relation to level of MARC record, 6

Other machine-readable data bases: characteristics, 10, 15; machine
    formats, 8-9; potential, 9; recommendations about, 32; standard for
    reporting on, 2, 16-17; survey, 7-11

Programming costs, 11
Proofing, 13, 15

RECON Advisory Committee, v
RECON Pilot Project, 30
RECON studies: funding, iii; goals, 1
RECON study, original, 1, 4, 33
RECON Working Task Force, v
Register numbers, 21

Sorting, 24-25
Systematic conversion, 30-31

Typing, 13, 15

Unit costs; see Manpower costs
University of Chicago Library, 14-15
Updating, 15, 23-25

Verifying, 13, 15